CLOSED FORM EXPRESSIONS FOR BAYESIAN SAMPLE SIZE

By B. Clarke and Ao Yuan 

University of British Columbia and Howard University 

Sample size criteria are often expressed in terms of the concen- 
tration of the posterior density, as controlled by some sort of error 
bound. Since this is done pre-experimentally, one can regard the pos- 
terior density as a function of the data. Thus, when a sample size 
criterion is formalized in terms of a functional of the posterior, its 
value is a random variable. Generally, such functionals have means 
under the true distribution. 

We give asymptotic expressions for the expected value, under a 
fixed parameter, for certain types of functionals of the posterior den- 
sity in a Bayesian analysis. The generality of our treatment permits 
us to choose functionals that encapsulate a variety of inference cri- 
teria and large ranges of error bounds. Consequently, we get sim- 
ple inequalities which can be solved to give minimal sample sizes 
needed for various estimation goals. In several parametric examples, 
we verify that our asymptotic bounds give good approximations to 
the expected values of the functionals they approximate. Also, our 
numerical computations suggest our treatment gives reasonable re- 
sults. 

1. Introduction. Suppose $X^n = (X_1, \ldots, X_n)$ is IID $p(\cdot|\theta)$, where the
d-dimensional parameter $\theta$ ranging over $\Omega \subset \mathbb{R}^d$ is equipped with a prior
probability $W(\cdot)$ having density $w(\theta)$ with respect to Lebesgue measure.
Given an outcome $x^n = (x_1, \ldots, x_n)$ of $X^n$, Bayesian inference is based
on the posterior density $w(\theta|x^n) = w(\theta)p(x^n|\theta)/m(x^n)$, where $m(x^n) =
\int w(\theta)p(x^n|\theta)\,d\theta$ is the mixture density. Once a prior, likelihood and
parametrization for $\theta$ are specified, the main pre-experimental task is to choose
the sample size n. The size of n will depend on the degree of accuracy desired
and on the sense in which that accuracy is to be achieved.

Sample size determination in the Bayesian setting is an important and 
practical problem. As yet there is no general and accepted asymptotically
valid closed form expression, such as we give here, that can be readily used
to give minimally necessary sample sizes to achieve pre-specified inference 
objectives, even in seemingly simple cases. For instance, it has taken a series 
of papers (see [19] and the references therein) to provide a reasonable treat- 
ment for the difference of two proportions with independent Beta densities 
under a variety of criteria. 

The lack of general expressions may be, in part, because the inferential 
criteria that have been used fall into three distinct classes. First, in the 
absence of a loss function, one often looks at properties of credibility sets — 
average length of the highest posterior density regions for instance. While 
this is often reasonable, the downside is that criteria that look for the worst 
case scenario often require overlarge sample sizes; see [14]. One way to correct 
for this is to include the cost of sampling in the optimality criterion. 

Second, when a loss function is available, the decision theoretic approach 
originated by Raiffa and Schlaifer [20] can be used. One benefit of this ap- 
proach is that it is easy to include the cost of sampling. The decision theo- 
retic approach was developed in [18]. See also [1] and [16] for an information 
perspective; Pham-Gia and Turkkan ([19], Section 4) provided some gen- 
eral comments. Cheng, Su and Berry [3] established asymptotic expressions 
for sample size computation in the clinical trial context for dichotomous 
responses. A general discussion of the relative merits of decision theoretic 
approaches to sample size problems can be found in [14, 17, 18]. 

A third class of treatments of the sample size problem is more "evidentiary":
These techniques tend to be based on hypothesis testing criteria such
as Bayes factors (see [6, 7, 15]) or robustness; see [8]. The predictive prob- 
ability criterion of [9], the distance between the posterior predictive density 
and the density updated on additional observations, and the direct evalua- 
tion of probabilities of events in the mixture distribution (see [4]) fall into 
this conceptual class as well. Since Bayesian testing can be framed as a deci- 
sion problem, this third class can be regarded as a special case of the second 
class. However, the emphasis is different. Decision theoretic approaches tend 
to emphasize risks and expectations, while evidentiary approaches tend to 
focus on conditional probabilities, often posterior probabilities of hypothe- 
ses. 

Because of this multiplicity of mathematically challenging criteria, it is 
not easy to parallel frequentist formulations. Nevertheless, many of these 
criteria can be represented as functionals F, not in general linear, of the 
posterior distribution $W(\cdot|X^n)$. For such cases, we provide a unified
framework, indicating how it can be adapted to various settings.

Our overall goal is to give simple closed form asymptotic expressions in 
the form of inequalities that can be solved to give sample sizes. The reader 
interested primarily in these expressions can find four of them in Section 4,
denoted (APVC), (ACC), (ALC) and (ES), to indicate the criteria. [Expressions
for similar cases are in Theorem 3.3 and in the Appendix; see (A.10),
(A.11) and (A.13).] Informally, our central strategy for obtaining these
expressions is the standard technique of approximating the leading term in
an expansion of the expectation of a functional. Recall that $W(\cdot|X^n)$ is
asymptotically $\Phi_{\hat\theta,(nI(\hat\theta))^{-1}}(\cdot)$ under $P_\theta$ in an $L^1$ sense. Here,
$\Phi_{\mu,\Sigma}(\cdot)$ is the distribution function for a Normal($\mu$, $\Sigma$), with density
denoted $\phi_{\mu,\Sigma}(\cdot)$, and $\hat\theta$ is the maximum likelihood estimator (MLE), with
asymptotic variance at a value $\theta$ given by the positive definite inverse
Fisher information matrix $I(\theta)^{-1}$. If $\theta_0$ is the data generating parameter,
adding and subtracting $E_{\theta_0}F(\Phi_{\hat\theta,(nI(\theta_0))^{-1}}(\cdot))$ gives

(1.1) $E_{\theta_0}F(W(\cdot|X^n)) = E_{\theta_0}F(\Phi_{\hat\theta,(nI(\theta_0))^{-1}}(\cdot)) + E_{\theta_0}R_n(F)$,

where $R_n(F) = [E_{\theta_0}F(W(\cdot|X^n)) - E_{\theta_0}F(\Phi_{\hat\theta,(nI(\theta_0))^{-1}}(\cdot))]$ is the remainder
term and F is a functional on distributions, that is, for any distribution
Q, $F(Q) \in \mathbb{R}$. Our hope is that the remainder term will be small enough
compared to the difference of the other two terms that (1.1) will permit
asymptotically valid closed form expressions for the sample size criterion
encapsulated by F.
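As an informal illustration of the decomposition in (1.1), the following sketch compares the two sides by simulation in one simple conjugate case. Everything specific here is an assumption made for illustration and is not taken from the paper: the model is Bernoulli($\theta_0$) with a Beta(a, b) prior, the functional F is the posterior variance, and the function names are hypothetical. For this F the normal approximation $N(\hat\theta, (nI(\theta_0))^{-1})$ has variance $\theta_0(1-\theta_0)/n$ whatever the value of $\hat\theta$, so the leading term is available in closed form.

```python
import numpy as np


def expected_posterior_variance(theta0, n, a=1.0, b=1.0, reps=20000, seed=0):
    """Monte Carlo estimate of E_{theta0}[ Var(theta | X^n) ] for a
    Bernoulli(theta0) sample of size n under a Beta(a, b) prior
    (an illustrative conjugate setting, assumed here)."""
    rng = np.random.default_rng(seed)
    # Number of successes in each simulated data set of size n.
    s = rng.binomial(n, theta0, size=reps)
    a_post = a + s
    b_post = b + n - s
    # Variance of a Beta(a_post, b_post) distribution.
    total = a_post + b_post
    var = a_post * b_post / (total ** 2 * (total + 1.0))
    return float(var.mean())


def normal_leading_term(theta0, n):
    """Variance of N(theta_hat, (n I(theta0))^{-1}) with I(theta) = 1/(theta(1-theta))."""
    return theta0 * (1.0 - theta0) / n


if __name__ == "__main__":
    for n in (50, 200, 800):
        mc = expected_posterior_variance(0.3, n)
        lead = normal_leading_term(0.3, n)
        print(n, mc, lead, mc - lead)
```

The printed difference plays the role of the expected remainder in (1.1) for this particular choice of F; in this toy case it shrinks faster than the leading term as n grows.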

1.1. An example of the techniques. Our verification that the remainder 
term in quantities like (1.1) is typically small rests on the foundational work 
of Johnson [10, 11], who developed Edgeworth style approximations for the 
posterior and certain posterior derived quantities such as percentiles and 
moments. Indeed, Edgeworth expansions and Johnson-style asymptotic
expressions provide asymptotic control for the values of both terms on the
right-hand side in (1.1), as $n \to \infty$, for various choices of F.

To see how these asymptotic expressions can be used to approximate the 
leading term of (1.1), and that the remainder term can be small compared 
to it, consider the following example. It is paradigmatic of our approach in 
its use of Johnson and Edgeworth expansions. The specific result can be 
obtained more readily by other techniques; however, our point is only to 
exemplify the reasoning informally. 



Set $F(W(\cdot|X^n)) = F_\alpha(W(\cdot|X^n)) = W(D_n|X^n)$, where $D_n = (-\infty, a_n(\alpha))$
and $a_n = a_n(\alpha) = a_n(\alpha, X^n)$ is the $\alpha$th quantile under the posterior
distribution $W(\cdot|X^n)$. Next, set

$D'_n = (-\infty,\ Z_n + (nI(\theta_0))^{-1/2}\Phi^{-1}(\alpha))$,

in which $Z_n$ is an asymptotically standard normal sequence of random
variables. It is seen that $D'_n$ is the region corresponding to $D_n$ but
under $\Phi_{Z_n,(nI(\theta_0))^{-1}}(\cdot)$, in which we have used $Z_n$ in place of $\hat\theta$ by
asymptotic normality of the MLE. That is, $D'_n$ approximates $D_n$. In this case, the
first term on the right-hand side of (1.1) is



(1.2) $E_{\theta_0}\Phi_{Z_n,(nI(\theta_0))^{-1}}(D'_n) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\Phi^{-1}(\alpha)} e^{-t^2/2}\,dt = \alpha.$



The remainder term in (1.1) is 

Posterior normality suggests (1.3) $\to 0$, but we want a rate that is small
relative to the rate of convergence of the left-hand side of (1.1) to (1.2),
which we take to be o(1). We ignore details on this latter rate since it is not
the point. Now, to get a rate for (1.3) $\to 0$, we use a modification of Johnson
([11], Theorem 5.1); it is justified below in Theorem 2.1. Thus, we have that
quantiles such as $a_n$ satisfy



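$a_n = \hat\theta_n + (nI(\theta_0))^{-1/2}\Big[\Phi^{-1}(\alpha) + \sum_{j=1}^{J}\tau_j(\alpha)n^{-j/2} + O(n^{-(J+1)/2})\Big]$,

where the $\tau_j$'s are polynomials with bounded coefficients that depend on the
data $X^n$, and $J \ge 1$. Now, writing $a'_n = Z_n + (nI(\theta_0))^{-1/2}\Phi^{-1}(\alpha)$ for the
upper endpoint of $D'_n$, we can write

(1.4)
$E_{\theta_0}|a_n - a'_n| = E_{\theta_0}\Big|\hat\theta_n + (nI(\theta_0))^{-1/2}\Big[\Phi^{-1}(\alpha) + \sum_{j=1}^{J}\tau_j(\alpha)n^{-j/2} + O(n^{-(J+1)/2})\Big] - a'_n\Big|$
$\quad = E_{\theta_0}|\hat\theta_n - Z_n| + O(n^{-1/2})$
$\quad \le E_{\theta_0}|\hat\theta_n - \theta_0| + E_{\theta_0}|Z_n - \theta_0| + O(n^{-1/2})$
$\quad = n^{-1/2}I^{-1/2}(\theta_0)\big(E_{\theta_0}|\sqrt{n}I^{1/2}(\theta_0)(\hat\theta_n - \theta_0)| + E_{\theta_0}|\sqrt{n}I^{1/2}(\theta_0)(Z_n - \theta_0)|\big) + O(n^{-1/2})$.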
Expression (1.4) can be controlled by using an Edgeworth expansion
for the density of $\hat\theta$ under $\theta_0$ in the first term in parentheses, namely,
$E_{\theta_0}|\sqrt{n}I^{1/2}(\theta_0)(\hat\theta_n - \theta_0)|$. Using this approximation and recognizing limiting
normal forms gives that, term by term, (1.4) is

$n^{-1/2}I^{-1/2}(\theta_0)\Big(\int |z|\phi(z)\,dz + \sum_{k=1}^{K} n^{-k/2}\int |z|P_k(z)\phi(z)\,dz + o(n^{-K/2})\int \frac{|z|}{1+|z|^{K+2}}\,dz + \int |z|\phi(z)\,dz\Big) + O(n^{-1/2})$.

So, (1.3) is $O(1/\sqrt{n})$ and the left-hand side of (1.1) is

(1.5) $E_{\theta_0}F_\alpha(W(\cdot|X^n)) = \alpha + o(1) + O(n^{-1/2})$,

that is, the expected Bayesian coverage probability is always $\alpha + o(1)$.

Improving (1.5) leads to inequalities that can be solved to give sample
sizes. That is, careful use of the Edgeworth and Johnson expansions that we
used to control (1.3) and (1.4) will give an error term of order $o(1/\sqrt{n})$. So,
we can find $N = N(\epsilon)$ large enough that, for a specified range of parameter
values $\theta$, we would have $|E_\theta F_\alpha(W(\cdot|X^n)) - \alpha| < \epsilon$ for $n > N$. Details on this
case are given below in Example 3 of Section 4. The "nicest" cases occur
when the first term in (1.1) is independent of the value of $\theta$ and the second
term goes to zero. As suggested by the form of (1.2), when the first term in
(1.1) depends on an estimator such as $a_n$ or $\hat\theta$, we expect an asymptotically
normal random variable $Z_n$ to appear in the limit. In these cases, we want
the second term of (1.1) to go to zero at a fast enough rate. Thus, we want
to give an expansion for it as a sum of powers of $1/\sqrt{n}$ times evaluations of
expectations.

1.2. Expected values of functionals of the posterior. Before proceeding
with the mathematical formalities, we suggest that the formulation we have
adopted here, representing sample size criteria as expectations of functionals
of the posterior, is the right one, in the sense that it is general enough to
encapsulate all the important cases, yet narrow enough to permit
straightforward analysis and use.

The three classes identified earlier — Bayes credibility, decision theoretic 
and evidentiary — suggest that many authors have, implicitly or explicitly, 
studied criteria that amount to functionals of the posterior, if not expecta- 
tions of them. Indeed, the pure Bayes and evidentiary approaches amount 
to studying functionals of the posterior and most of the decision theoretic 
optimality criteria can be written as functionals of the posterior; most of- 
ten these are clearly expectations. Moreover, taking expectations over the 
sample space pre-experimentally is standard Bayesian practice for design
problems. This is done in [23], for instance, an approach that motivated the
present work. Wang and Gelfand proposed a simulation based technique for 
determining a sample size large enough to achieve various pre-experimentally 
specified criteria. 

All the criteria used in [23] are special cases of the form $E(T(Y)) < \epsilon$,
where T is a nonnegative function in which the data Y appears via condi- 
tioning; see [23], Section 2, equation (6). Their simulation technique has a 
broad scope of application, and should be at least as accurate as approxi- 
mations based on asymptotic expansions. The special cases of F we use here 
are taken from [23]. 

We comment that some of the criteria used in Wang and Gelfand's sim- 
ulations, for instance, the average cover criterion, ACC, and average length 
criterion, ALC, have been studied mathematically. For instance, Joseph and 
Belisle [12] and Joseph, du Berger and Belisle [13] derived inequalities the 
sample size must satisfy under certain prior specifications for normal and 
binomial models. Wang and Gelfand's work [23] is important because these 
special cases may not cover all the settings of interest. 

Unfortunately, simulations may not always be easy to do. Moreover, the 
distinction between the sampling and fitting priors used in [23] may be a 
layer of conservatism that is not necessary. Aside from computational ease, 
Sahu and Smith ([21], Section 2.3) argue that using sampling and fitting 
priors permits weaker assumptions for the validity of inference. However, one 
could use a single objective prior for both sampling and fitting purposes to 
achieve essentially the same inferential validity. In either case, there remains 
a role in Bayesian experimental design for a good closed form expression for 
sample sizes. 

Expression (1.1) suggests a different tack for obtaining the kind of closed
form expressions we want. One could approximate $E_\theta F(W(\cdot|X^n))$ by
$E_\theta F(N(\hat\theta, (nI(\hat\theta))^{-1}))$, where N is a Laplace approximation to the
posterior, instead of a Johnson style expansion. The two approaches, Johnson
and Laplace, probably require similar hypotheses. Arguably, the Laplace
expansion is conceptually easier. However, Johnson expansions give an
approximation to $F(W(\cdot|X^n))$ directly rather than separately approximating
F and $W(\cdot|X^n)$. One could use more terms in the Laplace approximation,
evaluate F on those terms, and then approximate F, but the complexity 
would likely exceed what we have done here. The Johnson expansions are 
readily available and more direct, although a confirmatory treatment using 
Laplace's method would be welcome. 

The structure of this paper is as follows. Section 2 gives the theoretical 
context of our work: We observe generalizations of key results in Johnson 
[11] and state the version of Edgeworth expansions we will need. Then, we 
give a simple result, Proposition 2.1, that formalizes the strategy implicit
in (1.1). It seems that getting an asymptotic expression for general
functionals F is a hard problem so, in Section 3, we give asymptotic expressions
for three kinds of terms that often arise in special cases of functionals of the 
posterior density. Two of these theorems are derived from [11], and one is 
new. The most technical arguments from this section are relegated to the 
Appendix at the end. Section 4 uses our main results to show how four es- 
tablished criteria for sample size determination admit asymptotically valid 
closed form expressions. In Section 5 we compare the results of our asymp- 
totic expressions to closed form expressions obtained from three exponential 
families equipped with conjugate priors. It is seen that our asymptotic
expansions typically match the leading $1/\sqrt{n}$ terms in those cases. In addition,
Section 5 presents numerical results which confirm our approximations are
reasonably accurate.

2. Theoretical context. We consider the case that F is a functional on
distributions such as the posterior $W(\cdot|X^n = x^n)$ for a parameter. We
assume F represents something about how distributions concentrate at a
specific value in their support. Our interest here focuses on the class of F only
in that we want to include the commonly occurring sample size criteria used 
in [23]. 

We will need two assumptions to control the leading term in an expansion 
for E(F). The first is drawn from [11], Theorem 2.1: The expectation of the
functional of the posterior, $EF(W(\cdot|X^n))$, minus its normal approximation
[see (1.1)] must have an expansion of the form established by Johnson [11].
The second assumption is that the classical Edgeworth expansion can be
used to approximate the sampling distribution of $\hat\theta_n$ when $\theta$ is taken as
true.

To begin, we make Assumptions 1-9 in [11], modifying them only by
permitting $\theta$ to range over a set $\Omega \subset \mathbb{R}^d$. Together, these are the standard
"expected local sup" conditions that ensure the consistency, asymptotic
normality and efficiency of the MLE. Assumption 8, for instance, bounds the
first two derivatives of $\log p(x|\theta)$ by an integrable function so that, when
d = 1,

$\hat I(\theta) = -\frac{1}{n}\sum_{i=1}^{n}\frac{\partial^2}{\partial\theta^2}\log p(X_i|\theta) \to -E_\theta\frac{\partial^2}{\partial\theta^2}\log p(X|\theta) = I(\theta)$,

which generalizes directly to multivariate $\theta$.
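The convergence of the averaged second derivative of the log-likelihood to the Fisher information can be checked numerically in a simple case. The sketch below is only an illustration under assumptions not made in the paper: it uses a Bernoulli($\theta$) model, for which $\log p(x|\theta) = x\log\theta + (1-x)\log(1-\theta)$ and $I(\theta) = 1/(\theta(1-\theta))$, and the function names are hypothetical.

```python
import numpy as np


def empirical_information(x, theta):
    """Minus the sample average of the second derivative of the Bernoulli
    log-likelihood, evaluated at theta (illustrative d = 1 case)."""
    x = np.asarray(x, dtype=float)
    # d^2/dtheta^2 [x log(theta) + (1 - x) log(1 - theta)]
    second_derivs = -x / theta ** 2 - (1.0 - x) / (1.0 - theta) ** 2
    return float(-second_derivs.mean())


def fisher_information(theta):
    """Population Fisher information of a single Bernoulli(theta) observation."""
    return 1.0 / (theta * (1.0 - theta))


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    theta0 = 0.3
    for n in (100, 10_000, 1_000_000):
        x = rng.binomial(1, theta0, size=n)
        print(n, empirical_information(x, theta0), fisher_information(theta0))
```

With the data generated at the same value of $\theta$ at which the derivative is evaluated, the two printed columns should agree more closely as n grows, as the law of large numbers suggests.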

To set up our first result, we need some notation. Let $\theta$ be a random
realization of $\Theta$, $\phi_n = \sqrt{n}I^{1/2}(\hat\theta_n)(\theta - \hat\theta_n)$ and consider Johnson expanding
the posterior distribution function $W(\phi|X^n)$ of $\phi_n$. Johnson [11] obtained
an expansion for $W(\phi_n|X^n)$ in terms of normal densities with polynomial
factors when $\theta$ is one-dimensional. The expansion uses $(nI(\hat\theta_n))^{-1}$ as the
empirical variance of $\theta - \hat\theta$ and holds in an almost sure sense, for $n > N_x$,
where $N_x$ depends on the observed sample $x = x^n$. This is almost the
expansion we want. For our purpose, we set $\psi = \psi_n = \sqrt{n}I^{1/2}(\theta_0)(\theta - \hat\theta_n)$ for
given $\hat\theta_n$ and denote the posterior distribution function of it by $W_0(\cdot|X^n)$.
Writing the distribution of the d-dimensional standard normal $N(0, I_d)$ as
$\Phi(\cdot)$, with density $\phi(\cdot)$, we have $\Phi(\sqrt{n}I^{1/2}(\theta_0)(\theta - \hat\theta_n)) = \Phi_{\hat\theta_n, I^{-1}(\theta_0)/n}(\theta)$
and $\phi(\sqrt{n}I^{1/2}(\theta_0)(\theta - \hat\theta_n)) = |nI(\theta_0)|^{-1/2}\phi_{\hat\theta_n, I^{-1}(\theta_0)/n}(\theta)$. Let $w^{(r)}(\theta)$ be the
rth (vector) derivative of the prior density $w(\theta)$, when it exists, and write
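$\hat I_r(\theta) = \frac{1}{n|r|!}\sum_{i=1}^{n}\frac{\partial^{|r|}}{\partial\theta^r}\log p(X_i|\theta)$ for a vector $r = (r_1, \ldots, r_d)$, where $|r| = k$
means $r_1 + \cdots + r_d = k$, and for $\theta = (\theta_1, \ldots, \theta_d)$, $\theta^r$ means $\theta_1^{r_1}\cdots\theta_d^{r_d}$.
Examination of [11] gives the following.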

Theorem 2.1. Suppose all derivatives of $\log p(\cdot|\theta)$ of order J + 3 or less
exist and are continuous and that all the derivatives $|(\partial^{|r|}/\partial\theta^r)\log p(x|\theta)|$,
for $|r| \le J + 3$, are bounded in an open set containing $\theta_0$ by a function G(x)
with EG(X) finite. Suppose also that all derivatives of w up to order J + 1
exist and are continuous in a neighborhood of $\theta_0$. Then, for given $\theta_0$, there
are a sequence of sets $S_n$ with $P_{\theta_0}(S_n^c) = o(1)$, and an integer N, so that, for
$X^n \in S_n$, Theorems 2.1, 3.1, 4.1, 5.1 and 5.2 of [11] continue to hold with
$W(\phi|X^n)$ replaced by $W_0(\psi|X^n)$ when $n > N$. That is, we have:

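(A) For the posterior distribution:

(2.1) $\Big|W_0(\psi|X^n) - \Phi(\psi) - \sum_{j=1}^{J}\gamma_j(\psi)\phi(\psi)n^{-j/2}\Big| \le Cn^{-(J+1)/2}$, $n > N$, $X^n \in S_n$,

where $C > 0$ is a constant, and the $\gamma_j(\psi)$'s are polynomials in $\psi$ with bounded
coefficients.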

(B) For posterior moments: For each integer $i < K - 1$, there are a
sequence of functions $\{\lambda_{ij}(X^n)\}$, a constant $C > 0$ and an integer $N_i$ so that

(2.2) $\Big|E_{W_0(\cdot|X^n)}\big(I^{1/2}(\theta_0)(\theta - \hat\theta_n)\big)^i - \sum_{j=i}^{J}\lambda_{ij}(X^n)n^{-j/2}\Big| \le Cn^{-(J+1)/2}$, $n > N_i$,

on a set $S_n(i)$ with $P_{\theta_0}(S_n(i)^c) \to 0$, where $\lambda_{ij}(X^n) = 0$ for j odd, and for i
even we have

$\lambda_{ii}(X^n) = 2^{i/2}\Gamma((i+1)/2)/\Gamma(1/2)$,

while for i odd we have

$\lambda_{i,i+1}(X^n) = 2^{(i+1)/2}\big(2(i+1)\hat I_3(\hat\theta_n)\Gamma((i+4)/2) + \Gamma((i+2)/2)\,w^{(1)}(\hat\theta_n)/w(\hat\theta_n)\big)/\Gamma(1/2)$,

all of which are bounded in $X^n$.

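(C) For inverse quantiles: Let $\eta(\xi) = \Phi^{-1}(W_0(\xi|X^n))$ be the transformed
quantile of $W_0(\cdot|X^n)$. Then

(2.3) $\Big|\eta(\xi) - \xi - \sum_{j=1}^{J}\omega_j(\xi)n^{-j/2}\Big| \le Cn^{-(J+1)/2}$,

where $C > 0$ is a constant, for some functions $\omega_j(\xi) = \omega_j(\xi, X^n)$ that are
polynomials in $\xi$ with coefficients bounded for large enough n.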

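(D) For posterior quantiles: For a solution $\eta = \Phi^{-1}(W_0(\xi(\eta)|X^n))$, we
have the following:

(i)

(2.4) $\Big|\xi(\eta) - \eta - \sum_{j=1}^{J}\tau_j(\eta)n^{-j/2}\Big| \le Cn^{-(J+1)/2}$,

where $C > 0$ is a constant and the functions $\tau_j(\cdot)$ are polynomials in $\eta$ with
bounded coefficients.

(ii) If we set $\eta$ = $\alpha$th percentile of $\Phi$, then

(2.5) $\Big|W_0\Big(\eta + \sum_{j=1}^{J}n^{-j/2}\tau_j(\eta)\,\Big|\,X^n\Big) - \alpha\Big| \le Cn^{-(J+1)/2}$.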

Remark. This collection of statements differs from Johnson's [11]
results because we state it for general d-dimensional parameters, use a single
choice of N independent of the data string, and have replaced the empirical
Fisher information by its population value in the standardization of the
MLE. Replacing the $N_{k,x}$'s in [11] by a single fixed N means we can only
get a Johnson expansion valid for $x^n$ in a set $S_n$ with probability increasing
as $P_{\theta_0}(S_n) = 1 - o(1)$. To ensure $P_{\theta_0}(S_n^c) = o(1)$, we will typically need laws
of large numbers to hold for the $\hat I_r$'s occurring in the expansion; we assume
these as needed. Faster rates for $P_{\theta_0}(S_n^c) \to 0$, for instance, $P_{\theta_0}(S_n^c) \le e^{-\gamma n}$
for $\gamma > 0$, can be obtained by imposing moment generating function
assumptions to get a large deviations principle.

Note that $I(\theta_0)$ is used in the standardization of the MLE, but the
coefficients in the expansion remain empirical. That is, the coefficients in the
polynomials of the expansions are functions of the data, usually estimates
of population quantities of the form in [11], equations (2.25) and (2.26). When
it is important to replace these with differentiable quantities, as in the proof
of Theorem 3.3, we will use approximations such as $I(\hat\theta) = I(\theta_0) + o_p(1)$;
the $o_p(1)$ term in such approximations is what limits the accuracy of our
expansions.

Proof of Theorem 2.1. Proofs for (2.1)-(2.5) are all modifications
of the techniques in [11]. To demonstrate the modifications, consider (2.1).
It will be enough to check the proof of Theorem 2.1 in [11] line by line.

First, the main difference due to the dimensionality is that occurrences
of powers $(\theta - \hat\theta_n)^k$ in the one-dimensional case must be replaced by the
multi-dimensional version, $\sum_{|r|=k}(\theta - \hat\theta_n)^r$ for a d-tuple nonnegative integer
vector r.

Johnson used bounds $N_{k,x}$, $k = 1, \ldots, 5$, in his proof. The first two, $N_{1,x}$
and $N_{2,x}$, are used in his Lemmas 2.1 and 2.2, which are not needed in our
case, since we are replacing $I(\hat\theta_n)$ by $I(\theta_0)$. (Note that in the statement of
Lemma 2.2 in [11], $f(x_i, \theta)$ in the denominator should be $f(x_i, \hat\theta_n)$.) The next
two, $N_{3,x}$ and $N_{4,x}$, are from Lemmas 2.3 and 2.4. They arise from using
the strong law of large numbers finitely many times to get inequalities.
Denote the set on which the strong laws fail for a given n by $S_n^c$. Then,
the conclusions in Lemmas 2.3 and 2.4 hold for all $x^n \in S_n$, and $P(S_n^c) =
o(1)$. This property of the strong law holds even when $I(\hat\theta_n)$ is replaced by
$I(\theta_0)$. Finally, $N_{5,x} > N_{4,x}$ is used to allow the finite term approximations
(2.21) and (2.22) to be used in the expansions (2.19) and (2.20). The sets of
$x^n$'s on which this fails have probability tending to zero. Thus, they can be
put into $S_n^c$ too, and N can be chosen independent of $x^n$. □

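It is seen from (2.1) that, for $n > N$ and $X^n \in S_n$,

$W_0(\psi|X^n) = \Phi(\psi) + \sum_{j=1}^{J}\gamma_j(\psi)\phi(\psi)n^{-j/2} + \gamma_{J+1}(\psi)n^{-(J+1)/2}$

for positive integers J, where the polynomials $\gamma_j(\psi)$ in $\psi$ have finite
coefficients. Note that $\gamma_{J+1}$ is not known to be of the form of the $\gamma_j$'s when
$j \le J$; it is only known to be bounded. The other expansions (2.2)-(2.4)
give analogous statements.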

We formalize this class of posterior approximations in the following
definition. First, we say that $P_W(x^n)$ is a posterior derived object if and only
if $P_W(x^n)$ is a function of the posterior distribution $W(\cdot|x^n)$. Here, we have
chosen $W_0(\cdot|X^n)$ as the form of the posterior for our work. The class of
$P_W(x^n)$ does not matter, but the use of $W(\cdot|x^n)$ does. We rule out the
appearance of parameters or their estimates apart from $I(\theta_0)$. Thus, the
posterior itself and a posterior quantile are both posterior derived objects.
Assumption JE. A posterior derived object $P_{W_0}(x^n)$ is Johnson
expandable of order J if and only if it has a Johnson expansion of the following
form: There are an N and an $S_n$ with $P_{\theta_0}(S_n^c) = o(1)$ so that, for $n > N$, we
have

$\Big|P_{W_0}(x^n) - \sum_{j=0}^{J}\gamma_j(x^n)n^{-j/2}\Big| \le \frac{C}{n^{(J+1)/2}}$

for some $C > 0$, where the $\gamma_j(x^n)$'s are any quantities that depend only
on $W_0(\cdot|x^n)$.

We assume that all Assumption JE's are nontrivial, that is, the j = 0 term
is not $P_{W_0}(x^n)$.

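Next, we turn to the other asymptotic expansion assumption we will need.
For the MLE $\hat\theta_n$ of $\theta$ based on $X^n$, let $f_n(\hat\theta) = f_n(\hat\theta|\theta)$ be the density
function of $\hat\theta_n$ when $\theta$ is the true value, and let $g_n(\cdot) = g_n(\cdot|\theta)$ be the density
of $T = T_n = \sqrt{n}I^{1/2}(\theta_0)(\hat\theta_n - \theta)$ given $\theta$. (It is seen that T is a function of $\hat\theta$
for fixed $\theta$, whereas $a_n$ is a function of $\theta$ for given $\hat\theta$.) Observe that

$f_n(\hat\theta) = |nI(\theta_0)|^{1/2}g_n(\sqrt{n}I^{1/2}(\theta_0)(\hat\theta - \theta_0))$.

So, to get an expansion for $f_n$, it is enough to get one for $g_n$. For later use,
we record

$\Phi_{\hat\theta_n, I^{-1}(\theta_0)/n}(\theta) = \Phi(\sqrt{n}I^{1/2}(\theta_0)(\theta - \hat\theta_n))$

and

$\phi_{\theta_0,(nI(\theta_0))^{-1}}(\hat\theta) = |nI(\theta_0)|^{1/2}\phi(\sqrt{n}I^{1/2}(\theta_0)(\hat\theta - \theta_0))$.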

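The expansion for $g_n$ will depend on the form of the MLE. For many
parametric families, $\hat\theta_n$ can be expressed as

$\hat\theta_n = s\Big(\frac{1}{n}\sum_{i=1}^{n}h(X_i)\Big)$

for some $s(\cdot)$ and $h(\cdot)$. Thus, as argued in [24], we often have

$g_n(t) = \phi(t) + \sum_{k=1}^{K}n^{-k/2}P_k(t)\phi(t) + o(n^{-K/2})\frac{1}{1 + \|t\|^{K+2}}$,

where the error $o(n^{-K/2})$ is uniform over $\theta$ in a compact set and $t = \sqrt{n}I^{1/2}(\theta)(\hat\theta_n - \theta)$.
The $P_k(v)$'s are polynomials given by

$\phi_d(v)P_k(v) = \sum_{q=1}^{k}\frac{1}{q!}\sum_{u_1+\cdots+u_q=k}\ \sum_{|r_m|=u_m+2,\ (1\le m\le q)}\frac{\chi_{r_1}\cdots\chi_{r_q}}{r_1!\cdots r_q!}(-1)^{|r_1|+\cdots+|r_q|}D^{r_1+\cdots+r_q}\phi_d(v)$,

in which $\chi_r$, for a vector r, is the rth cumulant; see [2].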
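To make the shape of such an expansion concrete, the sketch below evaluates the textbook one-term (K = 1) univariate Edgeworth approximation for a standardized sample mean, $\phi(t)\{1 + \kappa_3(t^3 - 3t)/(6\sqrt{n})\}$, where $\kappa_3$ is the skewness of a single observation. This is an illustrative assumption rather than the paper's general formula: the exponential model with mean 1 (for which the MLE is the sample mean, $I = 1$ and $\kappa_3 = 2$) and all function names are chosen here only because the exact density of the standardized mean is available from the Gamma distribution for comparison.

```python
import math


def phi(t):
    """Standard normal density."""
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)


def edgeworth_one_term(t, n, skewness):
    """One-term Edgeworth approximation to the density of a standardized
    mean of n IID observations with the given skewness (d = 1, K = 1)."""
    he3 = t ** 3 - 3.0 * t  # third Hermite polynomial
    return phi(t) * (1.0 + skewness * he3 / (6.0 * math.sqrt(n)))


def exact_density_exp_mean(t, n):
    """Exact density of T = sqrt(n) * (mean - 1) for IID Exp(1) data,
    obtained from the Gamma(n, 1) density of the sum."""
    x = n + t * math.sqrt(n)
    if x <= 0.0:
        return 0.0
    log_pdf = (n - 1) * math.log(x) - x - math.lgamma(n)
    return math.sqrt(n) * math.exp(log_pdf)


if __name__ == "__main__":
    n = 20
    for t in (-1.5, -0.5, 0.5, 1.0, 2.0):
        print(t, phi(t), edgeworth_one_term(t, n, 2.0), exact_density_exp_mean(t, n))
```

In this toy case the correction term moves the normal density toward the exact one at most points away from the origin, which is the behavior the $n^{-k/2}P_k$ terms are meant to capture.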
Assumption EE. The Edgeworth expansion of order K for $f_n(\cdot)$
induced from $g_n(\cdot)$ is

$f_n(\hat\theta) = \phi_{\theta_0,(nI(\theta_0))^{-1}}(\hat\theta) + \sum_{k=1}^{K}n^{-k/2}P_k(\sqrt{n}I^{1/2}(\theta_0)(\hat\theta - \theta_0))\phi_{\theta_0,(nI(\theta_0))^{-1}}(\hat\theta) + o(n^{-K/2})\frac{|nI(\theta_0)|^{1/2}}{1 + \|\sqrt{n}I^{1/2}(\theta_0)(\hat\theta - \theta_0)\|^{K+2}}$

when it exists, where $\hat\theta$ is a dummy variable varying over values of $\theta$ and the
error $o(n^{-K/2})$ is uniform for $\theta_0$ in a compact set.

We comment that Yuan and Clarke [24] do not prove Assumption EE in 
full generality. They only establish uniformity for the density of the mean 
and for a certain restricted class of functions of the mean. However, the 
discussion in [24] suggests that Assumption EE holds in much greater gen- 
erality even though a formal proof does not yet exist. Indeed, when it fails, 
it seems to do so only on sets of very small probability which are enough 
to prevent the supremum from going to zero. Consequently, we suggest As- 
sumption EE is an acceptable hypothesis in a design setting where we are 
primarily interested in average behavior rather than worst case behavior. 

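Note that Assumption EE permits us to take expectations over the
parameter space and the sample space because the approximation is uniformly
good over both $\theta$ and $X^n$. Indeed, Assumption EE immediately gives an
expression for the mean of $\hat\theta$ because

(2.6)
$\int\hat\theta f_n(\hat\theta)\,d\hat\theta = \int |nI(\theta_0)|^{1/2}\,\hat\theta\,\phi(\sqrt{n}I^{1/2}(\theta_0)(\hat\theta - \theta_0))\,d\hat\theta$
$\quad + \sum_{k=1}^{K}n^{-k/2}\int |nI(\theta_0)|^{1/2}\,\hat\theta\,P_k(\sqrt{n}I^{1/2}(\theta_0)(\hat\theta - \theta_0))\,\phi(\sqrt{n}I^{1/2}(\theta_0)(\hat\theta - \theta_0))\,d\hat\theta$
$\quad + o(n^{-K/2})\int\frac{|nI(\theta_0)|^{1/2}\,\hat\theta}{1 + \|\sqrt{n}I^{1/2}(\theta_0)(\hat\theta - \theta_0)\|^{K+2}}\,d\hat\theta$
$= \theta_0 + \int\frac{u}{\sqrt{n}|I(\theta_0)|^{1/2}}\phi_d(u)\,du + \sum_{k=1}^{K}n^{-k/2}\int\big(\theta_0 + u/(\sqrt{n}|I(\theta_0)|^{1/2})\big)P_k(u)\phi_d(u)\,du$
$\quad + o(n^{-K/2})\int\frac{\theta_0 + u/(\sqrt{n}|I(\theta_0)|^{1/2})}{1 + \|u\|^{K+2}}\,du$
$= \theta_0 + \theta_0\sum_{k=1}^{K}n^{-k/2}P_k(\sigma) + |I(\theta_0)|^{-1/2}\sum_{k=1}^{K}n^{-(k+1)/2}P_{1,k}(\sigma) + o(n^{-K/2})$,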

where $P_k(\sigma)$ and $P_{1,k}(\sigma)$ are the expectations of $P_k(u)$ and $uP_k(u)$. The
argument $\sigma$ signifies that powers $u^m$ are replaced by $\sigma_m$'s, the mth moments
of N(0, 1). To see this, suppose $Z = (Z_1, \ldots, Z_d) \sim N(0, I_d)$ and that the ith
term in $P_k(u)$ has the form $a_i u_1^{j_1}\cdots u_d^{j_d}$. Then the term in its expectation is
$a_i\int u\,(u_1^{j_1}\cdots u_d^{j_d})\phi_d(u)\,du$, which equals $a_i E(Z_1^{j_1+1}Z_2^{j_2}\cdots Z_d^{j_d}, \ldots, Z_1^{j_1}\cdots Z_d^{j_d+1}) =
a_i(\sigma_{j_1+1}\sigma_{j_2}\cdots\sigma_{j_d}, \ldots, \sigma_{j_1}\cdots\sigma_{j_d+1})$, a vector with entries in which the powers
of $u_i$ correspond to standard normal moments.
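The rule "replace each power $u^m$ by the moment $\sigma_m$" is easy to mechanize. The following sketch does so for the univariate case d = 1 only; representing a polynomial as a dictionary from powers to coefficients, and the function names, are assumptions made for this illustration. It uses the standard facts that $\sigma_m = 0$ for odd m and $\sigma_m = (m-1)(m-3)\cdots 1$ for even m.

```python
def normal_moment(m):
    """m-th moment of N(0, 1): zero for odd m, (m - 1)!! for even m."""
    if m % 2 == 1:
        return 0.0
    result = 1.0
    for j in range(m - 1, 0, -2):
        result *= j
    return result


def expected_polynomial(coeffs):
    """E[P(Z)] for Z ~ N(0, 1), where coeffs maps each power m to its
    coefficient a_m; every u^m is replaced by sigma_m."""
    return sum(a * normal_moment(m) for m, a in coeffs.items())


def expected_u_times_polynomial(coeffs):
    """E[Z * P(Z)]: every u^m is replaced by sigma_{m+1}."""
    return sum(a * normal_moment(m + 1) for m, a in coeffs.items())


if __name__ == "__main__":
    hermite3 = {3: 1.0, 1: -3.0}  # u^3 - 3u
    print(expected_polynomial(hermite3), expected_u_times_polynomial(hermite3))
    print(expected_polynomial({2: 1.0}), expected_u_times_polynomial({3: 1.0}))
```

In the multivariate case the same substitution is applied coordinate by coordinate, as in the displayed vector of products of $\sigma$'s.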

Recall, our goal is to derive asymptotically, for pre-specified $\epsilon > 0$ and F,
the minimal sample size n to achieve

(2.7) $E_{\theta_0}F(W(\cdot|X^n)) < \epsilon$,

where the expectation is with respect to the density $p(x^n|\theta_0)$. Our main
approach to (2.7) rests on the following general procedure for the
computation of the asymptotic expected behavior of functionals of the posterior
distribution. As indicated in the Introduction, let

(2.8) $R_n = F(W(\cdot|X^n)) - F(\Phi_{\hat\theta,(nI(\theta_0))^{-1}}(\cdot))$,

where, under $\theta_0$, $\hat\theta_n$ is distributed as in Assumption EE, and we have done the
standardization in the limiting normal rather than in the nonstandardized
posterior $W(\cdot|X^n)$ for $\theta$.

Proposition 2.1. Functionals of the posterior distribution function
$W(\cdot|X^n)$ satisfy the following:

(i) If $F(\Phi_{z,(nI(\theta_0))^{-1}}(\cdot))$ is independent of z, then if Assumption JE
holds for some $J \ge 1$, we have

(2.9) $E_{\theta_0}F(W(\theta|X^n)) = F(\Phi_{0,(nI(\theta_0))^{-1}}(\theta)) + E_{\theta_0}R_n$.

(ii) If Assumption EE holds for some $K \ge 1$, we have that

(2.10)
$E_{\theta_0}F(W(\theta|X^n)) = E_{\theta_0}F(\Phi(Z + \sqrt{n}I^{1/2}(\theta_0)(\theta - \theta_0)))$
$\quad + \sum_{k=1}^{K}n^{-k/2}E_{\theta_0}F(\Phi(Z + \sqrt{n}I^{1/2}(\theta_0)(\theta - \theta_0)))P_k(Z)$
$\quad + o(n^{-K/2})h(n) + E_{\theta_0}R_n$,

where the first expectation on the right-hand side is with respect to $Z \sim
N(0, I_d)$, and

$h(n) = \int\frac{F(\Phi(z + \sqrt{n}I^{1/2}(\theta_0)(\theta - \theta_0)))}{1 + \|z\|^{K+2}}\,dz$.

Remark 1. In settings where our theorems for special cases do not
apply, we can often obtain results by use of (2.10). This will be seen in Section 4.
Moreover, it is seen that h is integrable when $F(\Phi(Z + \sqrt{n}I^{1/2}(\theta_0)(\theta - \theta_0)))$
is.

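Proof of Proposition 2.1. Assumption JE gives that $W_0(\cdot|X^n)$ is
approximated by $\Phi_{0,I_d}(\cdot)$, or $W(\cdot|X^n)$ is approximated by $\Phi_{\hat\theta,(nI(\theta_0))^{-1}}(\cdot)$.
Thus, the functional can be written as

$F(W(\cdot|X^n)) = F(\Phi_{\hat\theta,(nI(\theta_0))^{-1}}(\cdot)) + R_n$.

Taking expectations in $\theta_0$ and using Assumption EE gives

$E_{\theta_0}F(W(\cdot|X^n)) = \int F(\Phi_{u,(nI(\theta_0))^{-1}}(\cdot))\Big[\phi_{\theta_0,(nI(\theta_0))^{-1}}(u) + \sum_{k=1}^{K}n^{-k/2}P_k(\sqrt{n}I^{1/2}(\theta_0)(u - \theta_0))\phi_{\theta_0,(nI(\theta_0))^{-1}}(u)$
$\quad + o(n^{-K/2})\frac{|nI(\theta_0)|^{1/2}}{1 + \|\sqrt{n}I^{1/2}(\theta_0)(u - \theta_0)\|^{K+2}}\Big]\,du + E_{\theta_0}R_n$
$= \int F(\Phi(z + \sqrt{n}I^{1/2}(\theta_0)(\theta - \theta_0)))\phi_d(z)\,dz + \sum_{k=1}^{K}n^{-k/2}\int F(\Phi(z + \sqrt{n}I^{1/2}(\theta_0)(\theta - \theta_0)))P_k(z)\phi_d(z)\,dz$
$\quad + o(n^{-K/2})h(n) + E_{\theta_0}R_n$.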

In examples we will see that $o(n^{-K/2})h(n)$ is often of lower order than
$EF(\Phi(Z + \sqrt{n}I^{1/2}(\theta_0)(\theta - \theta_0)))$. Also, we observe the heuristic
approximation

$E[F(\Phi(Z + \sqrt{n}I^{1/2}(\theta_0)(\theta - \theta_0)))P_k(Z)]$
$\quad \approx E[F(\Phi(Z + \sqrt{n}I^{1/2}(\theta_0)(\theta - \theta_0)))]\,E[P_k(Z)]$
$\quad = E[F(\Phi(Z + \sqrt{n}I^{1/2}(\theta_0)(\theta - \theta_0)))]\,P_k(\sigma)$,

where Z is a $N(0, I_d)$ random vector, and $P_k(\sigma)$ is the expectation of $P_k(z)$
with powers replaced by $\sigma_l$, the lth moment of N(0, 1). Taken together,
these heuristics suggest that in many cases (2.10) gives

K 

Ee,Fiw{e\x^)) = EFi^z + v^i'/\eo){e - 60))) + 0{n-^'^) + 0(1). 

k=l 

3. Asymptotics for expected values of functionals. Proposition 2.1 was
of general applicability. However, there are commonly occurring functionals
that are worth examining in detail. When they depend on Johnson
expandable quantities such as those in Theorem 2.1, we have a J-term expansion
in powers of $n^{-1/2}$ on the "good" sets $S_n$. However, the coefficients depend
on $X^n$. This is a problem because we want to take the expectation over
the sample space for a functional of the posterior distribution. To get a
closed form for these expectations, we must replace the empirical quantities
in the coefficients in the expansion by their theoretical ones. Unfortunately,
as noted in the remark after Theorem 2.1, such approximations are only
accurate to order $o_p(1)$ unless more stringent hypotheses are proposed. Such
hypotheses are hard to determine in part because the forms of the
coefficients are generally unknown. Moreover, a posterior quantity must depend
on the data, so replacing all the estimates with population values, if it could
be done, defeats the purpose of using them. This is especially problematic
when our goal is to obtain sample sizes. A final caveat is that we have tacitly
been assuming that the expectation over the "bad" set $S_n^c$ will typically be
small compared to that over the "good" set $S_n$, as noted in the Remark
after Theorem 2.1, but we do not have a general closed form expression for
it.

Taken together, these considerations mean we will only get a two-term
expansion for the expectation, plus a remainder term

$R'_n = E_{\theta_0}\big(F(W(\cdot|X^n))I_{S_n^c}\big)$,

which we have argued is asymptotically small enough, relative to the main
approximation, that we can neglect it.

Theorems 3.1 and 3.2 below are extensions of results in [11], in which we
have left the dimension of the parameter d = 1; cases with $d \ge 2$ are similar.
Theorem 3.3 is more novel.

Let $\bar\theta$ be the posterior mean, which often has the form $\bar\theta = s((1/n)\sum_{i=1}^{n}h(X_i)) +
O_p(1/n)$. We use this in the first theorem because it is the right centering
for posterior moments and is very close to the MLE. Note that in general
we need to specify an estimator for planning purposes and that consistency
of the MLE generally ensures that Bayes estimators are consistent; see [22].
Our first result is the following.

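Theorem 3.1. Make all the assumptions in Section 2, in particular, those for Theorem 2.1. Also, assume Assumption EE for $\tilde\theta_n$ in place of $\hat\theta_n$. Suppose $\int |\theta|^r w(\theta)\, d\theta < \infty$ and choose $K, J > r$. Then,

\[ E_{\theta_0} E_{W(\cdot \mid X^n)}[(\theta - \tilde\theta_n)^r] = I^{-r/2}(\theta_0)\, \lambda_{rr}\, n^{-r/2} + o(n^{-r/2}) + R'_n, \tag{3.1} \]

where $\lambda_{rr} = 2^{r/2}\, \Gamma((r+1)/2)/\Gamma(1/2)$.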
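
As a quick numerical check on the constant in (3.1), the following sketch evaluates $\lambda_{rr}$ directly from the Gamma-function formula. The function name `lam` is an illustrative choice and not notation from the paper.

```python
import math


def lam(r: int) -> float:
    """Constant 2^(r/2) * Gamma((r+1)/2) / Gamma(1/2) appearing in (3.1)."""
    return 2.0 ** (r / 2.0) * math.gamma((r + 1) / 2.0) / math.gamma(0.5)


if __name__ == "__main__":
    # Print the constant for the first few moment orders.
    for r in (1, 2, 3, 4):
        print(r, lam(r))
```

For $r = 2$ the constant equals 1, which is the value used for the posterior variance criterion in Section 4; for even $r$ it coincides with the $r$th moment of a standard normal.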

Remark. In this case, the concern about using an approximation like $I^{i/2}(\hat\theta_n) = I^{i/2}(\theta_0)(1 + o(1))$ for $i = 1, \ldots, r$ is built into Theorem 2.1: The scaling in the posterior by $I(\theta_0)$ and the laws of large numbers that are invoked to get the sets $S_n$ are enough for the expansions of posterior moments and percentiles.

Proof of Theorem 3.1. Let $V_n = \sqrt{n}\, I^{1/2}(\theta_0)(\hat\theta_n - \theta_0)$. By Assumption EE for $V_n$, its density is

\[ g_n(v) = \phi(v) + \sum_{k=1}^{K} n^{-k/2} P_k(v)\phi(v) + \frac{o(n^{-K/2})}{1 + \|v\|^{K+2}}. \]

So we have

\[ E V_n^r = \int v^r g_n(v)\, dv = \sigma_r + \sum_{k=1}^{K} n^{-k/2} P_{r,k}(\sigma) + o(n^{-K/2}), \]

where $\sigma$ is the vector of central moments from a $N(0, I_d)$ as in (2.6) and the $o(n^{-K/2})$ comes from $o(n^{-K/2}) \int v^r/(1 + \|v\|^{K+2})\, dv$. The integral is finite since $K > r$.

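By using Assumption EE for both $\hat\theta_n$ and $\tilde\theta_n$, we have

\[ E_{\theta_0}(\tilde\theta_n - \hat\theta_n) = I^{-1/2}(\theta_0)\, n^{-1/2} E_{\theta_0}(\tilde V_n - V_n) = I^{-1/2}(\theta_0)\, n^{-1/2} \sum_{k=1}^{K} (\tilde P_{1,k}(\sigma) - P_{1,k}(\sigma))\, n^{-k/2} + o(n^{-(K+1)/2}) = O(n^{-1}), \]

where $\tilde V_n = \sqrt{n}\, I^{1/2}(\theta_0)(\tilde\theta_n - \theta_0)$, the $P_{1,k}(\sigma)$'s are defined after (2.6), and the $\tilde P_{1,k}(\sigma)$'s are their counterparts in the expansion for $\tilde V_n$. In general, for $m = 1, \ldots, r$, we have

\[ E_{\theta_0}(\tilde\theta_n - \hat\theta_n)^m = O(n^{-(m+1)/2}). \tag{3.2} \]

Note $E_{\theta_0} E_{W(\cdot \mid X^n)}(\theta - \tilde\theta_n)^r = E_{\theta_0} E_{W(\cdot \mid X^n)}(I_{S_n}(\theta - \tilde\theta_n)^r) + R'_n$, and we only need to deal with the first of these terms. We omit the indicator $I_{S_n}$ for simplicity.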

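Assumption JE is satisfied by use of expression (2.2) in Theorem 2.1. Thus, for $i = 1, \ldots, r$ we have

\[ E_{W(\cdot \mid X^n)}\bigl(I^{i/2}(\theta_0)(\theta - \hat\theta_n)^i\bigr) = \sum_{j=i}^{J} \lambda_{ij}(X^n)\, n^{-j/2} + O(n^{-(J+1)/2}) \tag{3.3} \]

on $\bigcap_{i=1}^{r} S_n(i)$ for $n > \max_{i \le r} N_i$, where the $O(\cdot)$ is independent of $X^n$.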

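Now we can deal with the expectations $E_{\theta_0} E_{W(\cdot \mid X^n)} I^{i/2}(\theta_0)(\theta - \hat\theta_n)^i$, for $i = 1, \ldots, r$. Let $C(r, i)$ be the number of subsets of size $i$ from a set of size $r$. By (3.2) and (3.3), we have

\[ E_{\theta_0} E_{W(\cdot \mid X^n)}(\theta - \tilde\theta_n)^r = E_{\theta_0} E_{W(\cdot \mid X^n)}\bigl((\theta - \hat\theta_n) + (\hat\theta_n - \tilde\theta_n)\bigr)^r \]
\[ = I^{-r/2}(\theta_0)\, E_{\theta_0} E_{W(\cdot \mid X^n)}\bigl(I^{r/2}(\theta_0)(\theta - \hat\theta_n)^r\bigr) + \sum_{i=0}^{r-1} C(r, i)\, I^{-r/2}(\theta_0)\, E_{\theta_0}\bigl[I^{(r-i)/2}(\theta_0)(\hat\theta_n - \tilde\theta_n)^{r-i}\, E_{W(\cdot \mid X^n)}\bigl(I^{i/2}(\theta_0)(\theta - \hat\theta_n)^i\bigr)\bigr] \]
\[ = I^{-r/2}(\theta_0)\, \lambda_{rr}\, n^{-r/2} + O(n^{-(r+1)/2}). \qquad \Box \]
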
Now that we have an asymptotic form for functionals based on posterior 
moments, we turn to percentiles. Our result is the following. 

Theorem 3.2. Make all the assumptions of Theorem 2.1 for some $J \ge 1$, and assume Assumption EE for some $K \ge 1$. Let $W^{-1}(\alpha \mid X^n)$ be the $\alpha$th quantile of $W(\cdot \mid X^n)$. Then we have

\[ E_{\theta_0} W^{-1}(\alpha \mid X^n) = \theta_0 + n^{-1/2} I^{-1/2}(\theta_0)\, \Phi^{-1}(\alpha) + o(n^{-1/2}) + R'_n. \tag{3.4} \]

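Proof. Let $\xi_\alpha$ be the $\alpha$th quantile of $\psi = \sqrt{n}\, I^{1/2}(\theta_0)(\theta - \hat\theta_n)$. That is,

\[ \alpha = W_\psi(\psi \le \xi_\alpha \mid X^n) = W(\theta \le n^{-1/2} I^{-1/2}(\theta_0)\, \xi_\alpha + \hat\theta_n \mid X^n). \]

So, we get

\[ W^{-1}(\alpha \mid X^n) = n^{-1/2} I^{-1/2}(\theta_0)\, \xi_\alpha + \theta_0 + n^{-1/2} I^{-1/2}(\theta_0)\, U_n, \tag{3.5} \]

where $U_n = \sqrt{n}\, I^{1/2}(\theta_0)(\hat\theta_n - \theta_0)$.

There is a function $\xi = \xi(\eta)$ which for any $\eta$ is a solution to $\Phi(\eta) = W_\psi(\xi(\eta) \mid X^n)$. So, given $\xi_\alpha$, we can backform to an $\eta_\alpha$ by defining the function $\xi(\cdot)$ to satisfy $\xi(\eta_\alpha) = \xi_\alpha$. Using this in (2.4) from Theorem 2.1, we get that $\xi(\eta_\alpha)$ satisfies Assumption JE, which we write as

\[ \xi(\eta_\alpha) = \eta_\alpha + \sum_{j=1}^{J+1} \tau_j(\eta_\alpha)\, n^{-j/2}, \qquad n > N,\ X^n \in S_n, \]

where the term with $j = J+1$ is the $O(n^{-(J+1)/2})$ remainder term, with $\tau_{J+1}(\eta_\alpha)$ bounded in absolute value (a.s.). Using this in (3.5), we get

\[ W^{-1}(\alpha \mid X^n) = n^{-1/2} I^{-1/2}(\theta_0) \Bigl(\eta_\alpha + \sum_{j=1}^{J+1} \tau_j(\eta_\alpha)\, n^{-j/2}\Bigr) + \theta_0 + n^{-1/2} I^{-1/2}(\theta_0)\, U_n, \tag{3.6} \]

for $n > N$ and $X^n \in S_n$.

By Assumption EE, we have

\[ E_{\theta_0} U_n = \sigma_1 + \sum_{k=1}^{K} n^{-k/2} P_{1,k}(\sigma) + o(n^{-K/2}), \tag{3.7} \]

in which we see $\sigma_1$ is the first moment of $N(0, 1)$ and so is 0. Also, we have

\[ \Phi(\eta_\alpha) = W_\psi(\xi(\eta_\alpha) \mid X^n) = W_\psi(\xi_\alpha \mid X^n) = \alpha, \]

so $\Phi^{-1}(\alpha) = \eta_\alpha$.

Finally, since $E_{\theta_0} W^{-1}(\alpha \mid X^n) = E_{\theta_0} I_{S_n} W^{-1}(\alpha \mid X^n) + R'_n$, we can take the expectation in (3.6), use (3.7), note that the $\tau_j(\eta_\alpha)$'s are bounded in $X^n$, collect terms and substitute for $\eta_\alpha$ to obtain

\[ E_{\theta_0} W^{-1}(\alpha \mid X^n) = \theta_0 + n^{-1/2} I^{-1/2}(\theta_0)\, \Phi^{-1}(\alpha) + o(n^{-1/2}) + R'_n. \qquad \Box \]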

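Next, we turn to derivatives of the posterior and more general posterior expectations. Denote the $r = (r_1, \ldots, r_d)$th derivative of $W(\theta \mid X^n)$ at $\theta$ by

\[ W^{(r)}(\theta \mid X^n) = \frac{\partial^{|r|}}{\partial\theta_1^{r_1} \cdots \partial\theta_d^{r_d}} W(\theta \mid X^n), \qquad |r| = r_1 + \cdots + r_d. \]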

To express the first terms in the expansion for the expectation of a derivative of a posterior distribution, we need to define two sets of polynomials that arise when we differentiate expressions involving the normal density. The first is the set of Hermite polynomials: For a vector $i$ of length $d$, let $H_i(\cdot)$ be the $i$th Hermite polynomial defined by $H_0(v) = 1$ when $i = 0$ and by the corresponding derivative identity for the normal density when $i \ne 0$. The second set of polynomials is particular to the use of Assumption JE for the posterior distribution. We define $\eta_j^{(r)}(\cdot)$ to be the polynomial given by

\[ \bigl[\phi(I^{1/2}(\theta_0) v)\, \gamma_j(I^{1/2}(\theta_0) v)\bigr]^{(r)} = \eta_j^{(r)}(v)\, \phi(I^{1/2}(\theta_0) v). \]

When we need to take expectations in the standard normal of products of polynomials $P(u)$ and $Q(u)$, we denote the polynomial of the normal moments by $P \circ Q$. That is, $E P(u) Q(u) \ne P(\sigma) Q(\sigma)$, but $E P(u) Q(u)$ is a polynomial in $\sigma$ which we denote $P \circ Q$. In this notation, we have the following.

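Theorem 3.3. Assume Assumptions JE and EE for some $J = K \ge 1$, and that $W(\theta \mid X^n)$ has $r = (r_1, \ldots, r_d)$th derivative at $\theta_0$ with $\min_j r_j \ge 1$. Then

\[ E_{\theta_0} W^{(r)}(\theta_0 \mid X^n) = \frac{n^{|r|/2}\, |I(\theta_0)|^{1/2}}{(4\pi)^{d/2}}\, H_{r-1}(\sigma) + A_1 n^{(|r|-1)/2} + o(n^{(|r|-1)/2}) + R'_n, \tag{3.8} \]

where $r - 1 = (r_1 - 1, \ldots, r_d - 1)$, $H_{r-1}(\sigma)$ is the expectation of $H_{r-1}(v)$ with powers $v^s = v_1^{s_1} \cdots v_d^{s_d}$ replaced by $\sigma_{s_j}(1/\sqrt{2})^{s_j}$, and $A_1$ collects the coefficients of order $n^{(|r|-1)/2}$ identified in (A.8) and (A.9) of the Appendix.

Proof. See the Appendix. $\Box$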

If we set $r = (1, \ldots, 1)$ in Theorem 3.3, we get the posterior density. In fact, we can get the result for any partial derivative without the restriction $\min\{r_1, \ldots, r_d\} \ge 1$, by a similar technique. However, the computation of the coefficients becomes more involved. Also, in the Appendix we develop an asymptotic expansion for

\[ E_{\theta_0} \int h(\theta)\, w(\theta \mid X^n)\, d\theta, \]

where $h$ is a specified differentiable function; see (A.13). Such expansions may be helpful in sample size criteria derived from hypothesis testing optimality.

4. Special cases. Here, we examine four functionals encapsulating different sample size criteria taken from [23]. It will be seen that Proposition 2.1 and the results from Section 3 can be used to obtain closed form expressions for Bayesian sample sizes. To avoid repetition, we assume all the required conditions on the models are satisfied and just derive the corresponding formulae.

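Example 1. For the criterion APVC in [23], set

\[ F(W(\cdot \mid X^n)) = \mathrm{Var}(\theta \mid X^n) = \int \theta'\theta\, W(d\theta \mid X^n) - \Bigl(\int \theta\, W(d\theta \mid X^n)\Bigr)' \Bigl(\int \theta\, W(d\theta \mid X^n)\Bigr). \]

By Theorem 3.1,

\[ E_{\theta_0} F(W(\cdot \mid X^n)) = I^{-1}(\theta_0)\, \lambda_{22}\, n^{-1} + o(n^{-1}) + R'_n, \]

in which $\lambda_{22} = 2\Gamma(1 + 1/2)/\Gamma(1/2) = 1$, since $\Gamma(1 + 1/2) = (1/2)\Gamma(1/2)$. Typically, $R'_n$ will be of smaller order than the main term, so for $\theta \in A$ with $\inf_{\theta \in A} |I(\theta)| > 0$, and prespecified $\varepsilon > 0$, the smallest sample size to achieve

\[ |E_{\theta_0} \mathrm{Var}(\theta \mid X^n)| \le \varepsilon \]

is approximately given by

\[ \text{(APVC)} \qquad n \ge \frac{1}{\varepsilon\, \inf_{\theta \in A} I(\theta)}. \]

A direct approach to this result by evaluating the terms in Proposition 2.1 can be done but seems to be quite involved.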
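
The (APVC) bound follows from requiring the leading term $1/(n I(\theta))$ to be at most $\varepsilon$ uniformly over $A$. A minimal sketch of that calculation follows; the function name `apvc_sample_size` and its arguments are illustrative assumptions, and the infimum of the Fisher information over $A$ is taken as an input rather than computed.

```python
import math


def apvc_sample_size(eps: float, inf_fisher: float) -> int:
    """Smallest integer n with 1 / (n * inf_fisher) <= eps.

    eps        -- prespecified bound on the expected posterior variance
    inf_fisher -- infimum of the Fisher information I(theta) over the set A
    """
    if eps <= 0 or inf_fisher <= 0:
        raise ValueError("eps and inf_fisher must be positive")
    return math.ceil(1.0 / (eps * inf_fisher))


if __name__ == "__main__":
    # Example: constant Fisher information 5 (e.g. a normal model with
    # variance 0.2) and a target expected posterior variance of 0.003.
    print(apvc_sample_size(0.003, 5.0))
```

Because only the leading term is used, the result should be read as an approximation that ignores the $o(n^{-1})$ and $R'_n$ contributions.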

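Example 2. For the criterion ACC in [23], set $F(W(\cdot \mid X^n)) = \int_{D_n} W(d\theta \mid X^n)$, in which $D_n$ is the HPD interval with length $l$ under the posterior distribution $W(\theta \mid X^n)$ and suppose $\theta$ is unidimensional. Unfortunately, our results in Section 3 do not apply, because, like the quantile example in the Introduction, the functional $F$ would have to depend on more than just the posterior.

However, we can still evaluate the terms in Proposition 2.1. The first term on the right-hand side of (2.6) is

\[ E F(\Phi(Z + \sqrt{n}\, I^{1/2}(\theta_0)(\cdot - \theta_0))) = E \int_{D'_n} \sqrt{\frac{n I(\theta_0)}{2\pi}}\, e^{-(1/2)(z + \sqrt{n I(\theta_0)}(\theta - \theta_0))^2}\, d\theta = E \int_{D'_n} \sqrt{\frac{n I(\theta_0)}{2\pi}}\, e^{-(1/2)\, n I(\theta_0)(\theta - \theta_0 + z/\sqrt{n I(\theta_0)})^2}\, d\theta. \tag{4.2} \]

From this, we see that $D'_n$ is of the form

\[ D'_n = [\theta_0 - n^{-1/2} I^{-1/2}(\theta_0) Z - l/2,\ \theta_0 - n^{-1/2} I^{-1/2}(\theta_0) Z + l/2], \]

which is the HPD interval for $\theta$ under $\Phi(Z + \sqrt{n}\, I^{1/2}(\theta_0)(\cdot - \theta_0))$ of length $l$. Let $\eta = \sqrt{n I(\theta_0)}(\theta - \theta_0 + z/\sqrt{n I(\theta_0)})$. Then $\eta \sim N(0, 1)$ and $D'_n = [-\sqrt{n I(\theta_0)}\, l/2 \le \eta \le \sqrt{n I(\theta_0)}\, l/2]$, so the right-hand side of (4.2) is

\[ \frac{1}{2\pi} \int \int_{[-\sqrt{n I(\theta_0)} l/2 \le \eta \le \sqrt{n I(\theta_0)} l/2]} e^{-(1/2)\eta^2}\, d\eta\ e^{-(1/2) z^2}\, dz = \bigl(2\Phi(\sqrt{n I(\theta_0)}\, l/2) - 1\bigr) \frac{1}{\sqrt{2\pi}} \int e^{-(1/2) z^2}\, dz = 2\Phi(\sqrt{n I(\theta_0)}\, l/2) - 1. \]

As $n \to \infty$ this term tends to 1.

For large $n$, $D_n$ is of the form $[\tilde\theta_n \pm l/2]$, where $\tilde\theta_n$ is the posterior mean, and $\tilde\theta_n \to \theta_0$ in $P_{\theta_0}$ probability. Also, we see that $F(\Phi(Z + \sqrt{n}\, I^{1/2}(\theta_0)(\cdot - \theta_0)))$ is in fact independent of $Z$. Now, we have that

\[ W([\tilde\theta_n \pm l/2] \mid X^n) \to 1 \]

and

\[ F(\Phi(Z + \sqrt{n}\, I^{1/2}(\theta_0)(\cdot - \theta_0))) = \Phi_{0, (n I(\theta_0))^{-1}}([\pm l/2]) \to 1, \]

also in $P_{\theta_0}$ probability. So, by the dominated convergence theorem, we have

\[ E_{\theta_0} R_n = E_{\theta_0}\bigl(W([\tilde\theta_n \pm l/2] \mid X^n) - \Phi_{Z, (n I(\theta_0))^{-1}}([Z \pm l/2])\bigr) \to 0. \]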

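In the decomposition from Proposition 2.1(i), we see that (4.2) is the leading term and the other terms tend to zero. So, for given $0 < \alpha < 1$, the minimal $n$ to achieve

\[ E_{\theta_0} F(W(\cdot \mid X^n)) = E_{\theta_0} \int_{D_n} W(d\theta \mid X^n) \ge 1 - \alpha \]

is approximately given by

\[ 2\Phi(\sqrt{n I(\theta_0)}\, l/2) - 1 \ge 1 - \alpha. \]

Equivalently, for $\theta \in A$ with $\inf_{\theta \in A} I(\theta) > 0$, we have

\[ \text{(ACC)} \qquad n \ge \frac{4\, (\Phi^{-1}(1 - \alpha/2))^2}{l^2\, \inf_{\theta \in A} I(\theta)}, \]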



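where $\Phi(\cdot)$ is the distribution function of $N(0, 1)$ and $\Phi^{-1}(\cdot)$ its inverse.

Example 3. For the criterion ALC in [23], take $F(W(\cdot \mid X^n)) = W^{-1}_{\theta \mid X^n}(1 - \alpha/2) - W^{-1}_{\theta \mid X^n}(\alpha/2)$, that is, suppose we require that the symmetric posterior quantiles be less than $l$ apart.

By Theorem 3.2,

\[ E_{\theta_0} F(W(\cdot \mid X^n)) = \frac{1}{\sqrt{n I(\theta_0)}} \bigl(\Phi^{-1}(1 - \alpha/2) - \Phi^{-1}(\alpha/2)\bigr) + o(n^{-1/2}). \]

So, for $\theta \in A$ with $\inf_{\theta \in A} I(\theta) > 0$, and given length $l$, the minimal $n$ to achieve

\[ E_{\theta_0}\bigl(W^{-1}_{\theta \mid X^n}(1 - \alpha/2) - W^{-1}_{\theta \mid X^n}(\alpha/2)\bigr) \le l \]

is approximately given by

\[ \text{(ALC)} \qquad n \ge \frac{\bigl(\Phi^{-1}(1 - \alpha/2) - \Phi^{-1}(\alpha/2)\bigr)^2}{l^2\, \inf_{\theta \in A} I(\theta)}. \]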
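
The coverage and length criteria both reduce to solving a normal-quantile inequality for $n$. The sketch below, assuming the leading-term conditions $2\Phi(\sqrt{n I}\, l/2) - 1 \ge 1 - \alpha$ for (ACC) and $(\Phi^{-1}(1-\alpha/2) - \Phi^{-1}(\alpha/2))/\sqrt{n I} \le l$ for (ALC), computes the resulting sample sizes. Function and argument names are illustrative, and the infimum of the Fisher information is supplied by the user.

```python
import math
from statistics import NormalDist


def acc_sample_size(alpha: float, length: float, inf_fisher: float) -> int:
    """Smallest n with 2*Phi(sqrt(n*I)*length/2) - 1 >= 1 - alpha."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return math.ceil(4.0 * z * z / (length ** 2 * inf_fisher))


def alc_sample_size(alpha: float, length: float, inf_fisher: float) -> int:
    """Smallest n with (Phi^-1(1-alpha/2) - Phi^-1(alpha/2)) / sqrt(n*I) <= length."""
    nd = NormalDist()
    spread = nd.inv_cdf(1.0 - alpha / 2.0) - nd.inv_cdf(alpha / 2.0)
    return math.ceil(spread ** 2 / (length ** 2 * inf_fisher))


if __name__ == "__main__":
    # Example inputs: alpha = 0.05, interval length 0.5, Fisher information 5.
    print(acc_sample_size(0.05, 0.5, 5.0), alc_sample_size(0.05, 0.5, 5.0))
```

Since $\Phi^{-1}(\alpha/2) = -\Phi^{-1}(1 - \alpha/2)$, the two leading-term conditions give the same $n$ for a given $\alpha$ and $l$; any difference between the criteria sits in the neglected higher-order terms.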
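Again, for completeness, we evaluate the terms in Proposition 2.1 directly. Let $\Phi_{Z, (n I(\theta_0))^{-1}}(\cdot)$ be the distribution function of $\phi_{Z, (n I(\theta_0))^{-1}}(\cdot)$ given $Z$ and suppose $\theta$ is unidimensional. It is straightforward to see that

\[ \Phi^{-1}_{Z, (n I(\theta_0))^{-1}}(\alpha/2) = Z + \frac{1}{\sqrt{n I(\theta_0)}}\, \Phi^{-1}(\alpha/2). \]

So, the first term in (2.10) is

\[ E F(\Phi(Z + \sqrt{n}\, I^{1/2}(\theta_0)(\cdot - \theta_0))) = E F(\Phi_{Z, (n I(\theta_0))^{-1}}(\cdot)) = E\bigl(\Phi^{-1}_{Z, (n I(\theta_0))^{-1}}(1 - \alpha/2) - \Phi^{-1}_{Z, (n I(\theta_0))^{-1}}(\alpha/2)\bigr) = \frac{1}{\sqrt{n I(\theta_0)}} \bigl(\Phi^{-1}(1 - \alpha/2) - \Phi^{-1}(\alpha/2)\bigr), \]

as obtained above from Theorem 3.2.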

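Next, we deal with the remainder term in (2.6). In fact, it is enough to use (1.1), the two-term version of (2.6), avoiding nontrivial expansions entirely. Since we have

\[ W(\sqrt{n}\, I^{1/2}(\theta_0)(\theta - \hat\theta_n) \mid X^n) \to N(0, I_d) \]

in distribution, the corresponding posterior quantiles converge for all $0 < \alpha < 1$. So, we obtain

\[ W^{-1}(\alpha \mid X^n) - \Phi^{-1}_{Z, (n I(\theta_0))^{-1}}(\alpha) = \hat\theta_n - Z + o_P(n^{-1/2}). \]

Since $E(Z) = \theta_0$, we can use Assumption EE to get

\[ E_{\theta_0} \hat\theta_n = \theta_0 + n^{-1/2} I^{-1/2}(\theta_0)\, E_{\theta_0}\bigl(\sqrt{n}\, I^{1/2}(\theta_0)(\hat\theta_n - \theta_0)\bigr) \]
\[ = \theta_0 + n^{-1/2} I^{-1/2}(\theta_0) \Bigl(\int v\, \phi_d(v)\, dv + \sum_{k=1}^{K} n^{-k/2} \int v P_k(v)\, \phi_d(v)\, dv + o(n^{-K/2}) \int \frac{v}{1 + \|v\|^{K+2}}\, dv\Bigr) = \theta_0 + O(n^{-1}). \]

Hence, with mild abuse of notation,

\[ E_{\theta_0} R_n = E_{\theta_0}\bigl(F(W(\cdot \mid X^n)) - F(\Phi_{Z, (n I(\theta_0))^{-1}}(\cdot))\bigr) = E_{\theta_0}(o_P(n^{-1/2})) = o(n^{-1/2}). \]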

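Example 4. For the effect size problem in [23], take $F(W(\cdot \mid X^n)) = \int_{\theta_1}^{\infty} W(d\theta \mid X^n)$. Here $\theta_1 < \theta_0$ and $\theta_1 < \hat\theta_n$ for large $n$. Our theorems do not apply, so we use Proposition 2.1. This gives

\[ E F(\Phi(Z + \sqrt{n}\, I^{1/2}(\theta_0)(\cdot - \theta_0))) = E \int_{\theta_1}^{\infty} \sqrt{\frac{n I(\theta_0)}{2\pi}}\, e^{-(1/2)(z + \sqrt{n I(\theta_0)}(\theta - \theta_0))^2}\, d\theta \]
\[ = \int_{\theta_1}^{\infty} \frac{\sqrt{n I(\theta_0)}}{2\pi}\, e^{-(n I(\theta_0)/4)(\theta - \theta_0)^2} \int e^{-(z + (1/2)\sqrt{n I(\theta_0)}(\theta - \theta_0))^2}\, dz\, d\theta \tag{4.3} \]
\[ = \int_{\theta_1}^{\infty} \sqrt{\frac{n I(\theta_0)}{4\pi}}\, e^{-(1/2)(n I(\theta_0)/2)(\theta - \theta_0)^2}\, d\theta = 1 - \Phi\Bigl(\sqrt{\frac{n I(\theta_0)}{2}}\, (\theta_1 - \theta_0)\Bigr). \]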



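We see that (4.3) goes to 1 as $n$ increases (since $\theta_1 < \theta_0$). We show that the other terms are $O(n^{-1/2})$, so that (4.3) is the leading term.

In fact, since

\[ F(\Phi(Z + \sqrt{n}\, I^{1/2}(\theta_0)(\cdot - \theta_0))) = \int_{\theta_1}^{\infty} \sqrt{\frac{n I(\theta_0)}{2\pi}}\, e^{-(1/2)(Z + \sqrt{n I(\theta_0)}(\theta - \theta_0))^2}\, d\theta = 1 - \Phi(Z + \sqrt{n I(\theta_0)}(\theta_1 - \theta_0)), \]

which is bounded, for $J \ge 1$ we have that

\[ \sum_{j=1}^{J} n^{-j/2} E[F(\Phi(Z + \sqrt{n}\, I^{1/2}(\theta_0)(\cdot - \theta_0))) P_j(Z)] + o(n^{-J/2}) h(n) = \sum_{j=1}^{J} n^{-j/2} O(1)\, E P_j(Z) + o(n^{-J/2}) h(n) = O(n^{-1/2}), \]

since the $E P_j(Z)$'s are finite and $h(n) = o(1)$ by a similar evaluation as in (4.3).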

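For the remainder term, as in the proofs of the theorems, we only consider the "good" sets, omitting indicators on them. We have

\[ E_{\theta_0} R_n = E_{\theta_0} \int_{\theta_1}^{\infty} d\bigl(W(\theta \mid X^n) - \Phi_{\hat\theta_n, (n I(\theta_0))^{-1}}(\theta)\bigr) \]
\[ = E_{\theta_0} \int_{\theta_1}^{\infty} \sum_{j=1}^{J} n^{-j/2}\, n^{1/2} I^{1/2}(\theta_0)\, \phi_d(\sqrt{n}\, I^{1/2}(\theta_0)(\theta - \hat\theta_n))\, \gamma_j(\sqrt{n}\, I^{1/2}(\theta_0)(\theta - \hat\theta_n))\, d\theta + \cdots \]
\[ = E_{\theta_0} \int_{\sqrt{n} I^{1/2}(\theta_0)(\theta_1 - \hat\theta_n)}^{\infty} \Bigl(\sum_{j=1}^{J} n^{-j/2} \phi_d(v)\, \gamma_j(v) + n^{-(J+1)/2} \gamma_{J+1}(v)\Bigr)\, dv. \tag{4.4} \]

Since each term in (4.4) is integrable, expression (4.4) is bounded in absolute value by

\[ \int \Bigl(\sum_{j=1}^{J} n^{-j/2} \phi_d(v)\, |\gamma_j|(v) + n^{-(J+1)/2} |\gamma_{J+1}|(v)\Bigr)\, dv = O(n^{-1/2}), \]

where, for a polynomial $P(\cdot)$, $|P|(v)$ is $P(v)$ with the coefficients and powers replaced by their absolute values.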

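So, for (4.3), for $\theta \in A = [a, b]$ with $\inf_{\theta \in A} I(\theta) > 0$, and given $0 < \alpha < 1$, the minimal $n$ achieving

\[ E_{\theta_0} \int_{\theta_1}^{\infty} W(d\theta \mid X^n) \ge 1 - \alpha \]

is approximated by

\[ 1 - \Phi\Bigl(\sqrt{\frac{n I(\theta_0)}{2}}\, (\theta_1 - \theta_0)\Bigr) \ge 1 - \alpha, \]

which gives

\[ \text{(ES)} \qquad n \ge \frac{2\, (\Phi^{-1}(1 - \alpha))^2}{\inf_{\theta \in A} (\theta_1 - \theta)^2 I(\theta)}. \]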
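
The effect size bound can be evaluated in the same way as the other criteria. The sketch below assumes the leading-term condition $1 - \Phi(\sqrt{n I(\theta)/2}\,(\theta_1 - \theta)) \ge 1 - \alpha$ with $\alpha < 1/2$, and approximates the infimum over $A$ by a minimum over a user-supplied grid; the function name, the grid, and the `fisher` callable are all illustrative assumptions.

```python
import math
from statistics import NormalDist
from typing import Callable, Iterable


def es_sample_size(alpha: float, theta1: float,
                   theta_grid: Iterable[float],
                   fisher: Callable[[float], float]) -> int:
    """Smallest n with 2 * z^2 / min_theta (theta1 - theta)^2 I(theta) <= n.

    theta_grid is a finite grid standing in for the set A (an approximation
    to the infimum); fisher(theta) returns the Fisher information at theta.
    """
    if not 0.0 < alpha < 0.5:
        raise ValueError("alpha must lie in (0, 0.5)")
    z = NormalDist().inv_cdf(1.0 - alpha)
    worst = min((theta1 - t) ** 2 * fisher(t) for t in theta_grid)
    if worst <= 0:
        raise ValueError("(theta1 - theta)^2 * I(theta) must be positive on the grid")
    return math.ceil(2.0 * z * z / worst)


if __name__ == "__main__":
    # Example: constant Fisher information 5, theta1 = 0.3, A = [0.5, 0.7].
    print(es_sample_size(0.05, 0.3, [0.5, 0.6, 0.7], lambda t: 5.0))
```

A finer grid gives a closer approximation to the infimum when $(\theta_1 - \theta)^2 I(\theta)$ is not monotone on $A$.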



5. Comparisons with exact results and numerical evaluations. In this 
section we present some closed form expressions for the sample size criteria 
we have evaluated asymptotically. Then we turn to some numerical work. 
Both types of material suggest our asymptotic approximations are reason- 
able. 

5.1. Exact results. In the case of the normal density with a conjugate normal prior we can obtain exact expressions from direct calculation for all four criteria we studied in Section 4. It is seen that our asymptotic expressions match these up to the stated error terms. More generally, only the (APVC) criterion, arguably the most popular of the four we have examined, can be calculated explicitly. We present two more examples, the Poisson($\theta$) with a Gamma($a$, $b$) prior and the Binomial($\theta$) with a Uniform([0, 1]) prior. Again, it is seen that our asymptotic expressions match the direct calculation expressions up to the stated order of error.

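To begin the normal case, we record that, for $X \mid \theta \sim N(\theta, \sigma_0^2)$ and $\theta \sim N(\mu_0, \tau_0^2)$, we get $I(\theta_0) = 1/\sigma_0^2$ and $W(\theta \mid X^n) = N(\theta_n, \sigma_n^2)$ with

\[ \theta_n = \frac{n \tau_0^2 \bar X + \sigma_0^2 \mu_0}{n \tau_0^2 + \sigma_0^2}, \qquad \sigma_n^2 = \frac{\sigma_0^2 \tau_0^2}{n \tau_0^2 + \sigma_0^2}. \]

Next we go through the four criteria in turn.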

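For the (APVC), the exact quantity is

\[ E_{\theta_0}(\mathrm{Var}(\theta \mid X^n)) = \mathrm{Var}(\theta \mid X^n) = \frac{\sigma_0^2 \tau_0^2}{n \tau_0^2 + \sigma_0^2} = \frac{\sigma_0^2}{n} - \frac{\sigma_0^4}{n (n \tau_0^2 + \sigma_0^2)}. \]

If we choose $r = 2$, we have $\lambda_{22} = 2\Gamma(1 + 1/2)/\Gamma(1/2) = 1$, so by Theorem 3.1, we get

\[ E_{\theta_0}(\mathrm{Var}(\theta \mid X^n)) = \frac{\sigma_0^2}{n} + o(n^{-1}) + R'_n, \]

which matches up to the stated error.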
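For the (ALC), let $Z_n \sim N(\theta_n, \sigma_n^2)$. Then

\[ \alpha = P(Z_n \le W^{-1}(\alpha \mid X^n) \mid X^n) = \Phi\bigl(\sigma_n^{-1}(W^{-1}(\alpha \mid X^n) - \theta_n)\bigr), \]

so

\[ \sigma_n^{-1}(W^{-1}(\alpha \mid X^n) - \theta_n) = \Phi^{-1}(\alpha) \quad \text{or} \quad W^{-1}(\alpha \mid X^n) = \theta_n + \sigma_n \Phi^{-1}(\alpha). \]

Since $\bar X \sim N(\theta_0, \sigma_0^2/n)$, we have

\[ E_{\theta_0} W^{-1}(\alpha \mid X^n) = \frac{\theta_0 + \sigma_0^2 \mu_0/(n \tau_0^2)}{1 + \sigma_0^2/(n \tau_0^2)} + \frac{\sigma_0}{\sqrt{n + \sigma_0^2/\tau_0^2}}\, \Phi^{-1}(\alpha) = \theta_0 + \frac{\sigma_0}{\sqrt{n}}\, \Phi^{-1}(\alpha) + o(n^{-1/2}). \]

By Theorem 3.2, we have

\[ E_{\theta_0} W^{-1}(\alpha \mid X^n) = \theta_0 + \frac{\sigma_0}{\sqrt{n}}\, \Phi^{-1}(\alpha) + o(n^{-1/2}) + R'_n, \]

matching up to the stated error.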

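For the (ACC), we have $D_n = [\theta_n - l/2, \theta_n + l/2]$, and

\[ E_{\theta_0} W([\theta_n \pm l/2] \mid X^n) = E_{\theta_0} \int_{\theta_n - l/2}^{\theta_n + l/2} \frac{1}{\sqrt{2\pi}\, \sigma_n}\, e^{-(1/(2\sigma_n^2))(\theta - \theta_n)^2}\, d\theta = 2\Phi(\sigma_n^{-1} l/2) - 1 = 2\Phi\Bigl(\frac{\sqrt{n}\, l}{2\sigma_0}\Bigr) - 1 + o(1). \]

From Example 2, we have

\[ E_{\theta_0} \int_{D_n} W(d\theta \mid X^n) = 2\Phi\Bigl(\frac{\sqrt{n}\, l}{2\sigma_0}\Bigr) - 1 + o(1), \]

matching up to the stated error.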

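For the effect size criterion, let $\theta_1 < \theta_0$. So, for large $n$, $\theta_n - \theta_1 > \varepsilon > 0$ (a.s.), $\sigma_n^{-1} = O(n^{1/2})$ and $\theta_n - \bar X = O(n^{-1})$, giving $\sigma_n^{-1}(\theta_n - \bar X) = O(n^{-1/2})$. Using this in the functional gives

\[ E_{\theta_0} \int_{\theta_1}^{\infty} w(\theta \mid X^n)\, d\theta = E_{\theta_0} \int_{\theta_1}^{\infty} \frac{1}{\sqrt{2\pi}\, \sigma_n}\, e^{-(\theta - \theta_n)^2/(2\sigma_n^2)}\, d\theta = E_{\theta_0} \int_{\sigma_n^{-1}(\theta_1 - \theta_n)}^{\infty} \frac{1}{\sqrt{2\pi}}\, e^{-a^2/2}\, da \]
\[ = E_{\theta_0} \int_{\sigma_n^{-1}(\theta_1 - \bar X)}^{\infty} \frac{1}{\sqrt{2\pi}}\, e^{-a^2/2}\, da + O(n^{-1/2}). \]

Since

\[ \sigma_n^{-1} - \frac{\sqrt{n}}{\sigma_0} = O(n^{-1/2}), \]

so that $\sigma_n^{-1}(\theta_1 - \bar X) - (\sqrt{n}/\sigma_0)(\theta_1 - \bar X) = O(n^{-1/2})$ (a.s.), the last expression for the functional is

\[ E_{\theta_0} \int_{(\sqrt{n}/\sigma_0)(\theta_1 - \bar X)}^{\infty} \frac{1}{\sqrt{2\pi}}\, e^{-a^2/2}\, da + O(n^{-1/2}) \]
\[ = \int_{-\infty}^{\infty} \int_{\theta_1}^{\infty} \frac{1}{2\pi}\, \frac{n}{\sigma_0^2}\, e^{-n(\theta - x)^2/(2\sigma_0^2)}\, e^{-n(x - \theta_0)^2/(2\sigma_0^2)}\, d\theta\, dx + O(n^{-1/2}) \]
\[ = \int_{\theta_1}^{\infty} \frac{1}{\sqrt{4\pi}}\, \frac{\sqrt{n}}{\sigma_0}\, e^{-n(\theta - \theta_0)^2/(4\sigma_0^2)}\, d\theta + O(n^{-1/2}) \]
\[ = \int_{(\sqrt{n}/(\sqrt{2}\sigma_0))(\theta_1 - \theta_0)}^{\infty} \frac{1}{\sqrt{2\pi}}\, e^{-a^2/2}\, da + O(n^{-1/2}) = 1 - \Phi\Bigl(\sqrt{\frac{n}{2}}\, \frac{\theta_1 - \theta_0}{\sigma_0}\Bigr) + O(n^{-1/2}). \]

From Example 4, we have that

\[ E_{\theta_0} \int_{\theta_1}^{\infty} W(d\theta \mid X^n) = 1 - \Phi\Bigl(\sqrt{\frac{n}{2}}\, \frac{\theta_1 - \theta_0}{\sigma_0}\Bigr) + o(1), \]

again matching up to the stated error. In this case, the exact expression gave slightly stronger control of the error.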

Next, we turn to two other examples for the (APVC). Of the four criteria, only the (APVC) is simple enough that it can be obtained in closed form in some cases.

Let $X \mid \theta \sim$ Poisson($\theta$), and suppose $\theta \sim G(a, b)$, the Gamma distribution with $a$, $b$ known. Let $S_n = \sum_{i=1}^{n} X_i$. Then, by standard results, $\theta \mid X^n \sim G(a + n, b + S_n)$, with $E(\theta \mid X^n) = (b + S_n)/(a + n)$, $\mathrm{Var}(\theta \mid X^n) = (b + S_n)/(a + n)^2$, $I(\theta_0) = 1/\theta_0$, and $E_{\theta_0}(S_n) = n\theta_0$.

So, the expected posterior variance is

\[ E_{\theta_0}(\mathrm{Var}(\theta \mid X^n)) = \frac{b + n\theta_0}{(a + n)^2} = \frac{\theta_0}{n} + \frac{b - a\theta_0}{(n + a)^2} - \frac{a\theta_0}{n(n + a)}, \]

and by Theorem 3.1, the approximation is

\[ E_{\theta_0}(\mathrm{Var}(\theta \mid X^n)) = \frac{\theta_0}{n} + o(n^{-1}) + R'_n. \]

As in the normal case, the two match up to the stated error.
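
To see how close the Poisson-Gamma approximation is at small $n$, the sketch below evaluates both quantities, using the text's parametrization in which the posterior mean is $(b + S_n)/(a + n)$. The function name is an illustrative assumption.

```python
def poisson_gamma_apvc(theta0: float, a: float, b: float, n: int):
    """Return (exact, asymptotic) expected posterior variance.

    exact      -- (b + n*theta0) / (a + n)^2, using E(S_n) = n*theta0
    asymptotic -- theta0 / n, the leading term 1 / (n * I(theta0))
    """
    exact = (b + n * theta0) / (a + n) ** 2
    asymptotic = theta0 / n
    return exact, asymptotic


if __name__ == "__main__":
    # Parameter values (theta0, a, b) = (0.5, 2.5, 3.5) as used for eta_1 in Table 2.
    for n in (10, 30, 50, 100):
        print(n, poisson_gamma_apvc(0.5, 2.5, 3.5, n))
```

With these inputs the exact and asymptotic values differ by an $O(n^{-2})$ amount, so the gap shrinks quickly as $n$ grows.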

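Now, let $X \mid \theta \sim$ Binomial($\theta$) with $\theta \sim U(0, 1)$. Setting $S_n = \sum_{i=1}^{n} X_i$, standard results give that $\theta \mid X^n \sim$ Beta($S_n + 1$, $n + 1 - S_n$), with $E(\theta \mid X^n) = (S_n + 1)/(n + 2)$, $\mathrm{Var}(\theta \mid X^n) = (n S_n - S_n^2 + n + 1)/[(n + 2)^2 (n + 3)]$, $I(\theta_0) = 1/[\theta_0(1 - \theta_0)]$, $E_{\theta_0}(S_n) = n\theta_0$ and $E_{\theta_0}(S_n^2) = n\theta_0(1 - \theta_0) + n^2\theta_0^2$.

The expected posterior variance is

\[ E_{\theta_0}(\mathrm{Var}(\theta \mid X^n)) = \frac{n^2\theta_0 - n\theta_0 - n(n - 1)\theta_0^2 + n + 1}{(n + 2)^2 (n + 3)} \]
\[ = \frac{\theta_0(1 - \theta_0)}{n} - \frac{3\theta_0(1 - \theta_0)}{n(n + 3)} + \frac{1 - 5\theta_0(1 - \theta_0)}{(n + 2)(n + 3)} + \frac{6\theta_0(1 - \theta_0) - 1}{(n + 2)^2 (n + 3)}. \]

By Theorem 3.1 our approximation is

\[ E_{\theta_0}(\mathrm{Var}(\theta \mid X^n)) = \frac{\theta_0(1 - \theta_0)}{n} + o(n^{-1}) + R'_n. \]

As before, the two agree.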
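
The Binomial-Uniform case can be checked numerically in the same way. The sketch below, with an illustrative function name, evaluates the exact expected posterior variance from the moments of $S_n$ and compares it with the leading term $\theta_0(1 - \theta_0)/n$.

```python
def binomial_uniform_apvc(theta0: float, n: int):
    """Return (exact, asymptotic) expected posterior variance.

    exact uses E(S_n) = n*theta0 and E(S_n^2) = n*theta0*(1-theta0) + n^2*theta0^2
    in Var(theta | X^n) = (n*S_n - S_n^2 + n + 1) / ((n+2)^2 * (n+3)).
    """
    numerator = n ** 2 * theta0 - n * theta0 - n * (n - 1) * theta0 ** 2 + n + 1
    exact = numerator / ((n + 2) ** 2 * (n + 3))
    asymptotic = theta0 * (1.0 - theta0) / n
    return exact, asymptotic


if __name__ == "__main__":
    # theta0 = 0.20 corresponds to eta_1 for the Binomial-Uniform rows of Table 2.
    for n in (10, 30, 50, 100):
        print(n, binomial_uniform_apvc(0.20, n))
```

The difference between the two values is of order $n^{-2}$, consistent with the $o(n^{-1})$ error term in Theorem 3.1.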

The agreement between the asymptotics and the closed form expressions 
suggests that in the other examples the discrepancy between the two will be 
small. Indeed, all of the criteria are derived from posteriors and posterior 
objects which can be approximated as well as desired by taking enough terms 
in the expansions. That is, optimizing the asymptotic expression obtained 
by using more terms will give any desired degree of accuracy. We suggest this 
will only be needed in extreme cases when the coefficients in the neglected 
higher-order terms are so large, possibly because of the range of the set in 
the parameter space, that they overwhelm the lower-order terms. 

5.2. Numerical evaluations. Fundamentally, the class of quantities we want to evaluate is of the form $G = E_\theta F_\varepsilon(W(\cdot \mid X^n))$, where $F$ represents the inference objective and $\varepsilon$ summarizes how well it must be met. To begin, we present computations for two simple cases in which $G$ can be obtained from the closed form expressions in Section 5.1. We compare selected values of $G$ with the corresponding approximations $G^*$ from our asymptotic formulae. We look at expected values of functionals, rather than fix $\varepsilon$'s and find optimal sample sizes, to make it easy to compare these first two simple cases with a more complicated third case.

Table 1
Exact vs. asymptotic: Normal-Normal

Parameter   n     (APVC): G (G*)     (ALC): G (G*)        (ACC): G (G*)
η1          10    0.0187 (0.0200)    0.2591 (0.2674)      0.1449 (0.1403)
            30    0.0065 (0.0067)    0.3617 (0.3657)      0.2431 (0.2405)
            50    0.0039 (0.0040)    0.3934 (0.3960)      0.3093 (0.3074)
            100   0.0020 (0.0020)    0.4250 (0.4264)      0.4251 (0.4238)
η2          10    0.2308 (0.2500)    4.0944 (4.1776)      0.3972 (0.3829)
            30    0.0811 (0.0833)    4.4911 (4.5252)      0.6200 (0.6135)
            50    0.0492 (0.0500)    4.6106 (4.6322)      0.7404 (0.7364)
            100   0.0248 (0.0250)    4.7286 (4.7399)      0.8877 (0.8862)
η3          10    1.6071 (1.8000)    22.3791 (22.7932)    0.6759 (0.6485)
            30    0.5769 (0.6000)    23.5583 (23.7259)    0.9002 (0.8934)
            50    0.3516 (0.3600)    23.9075 (24.0131)    0.9650 (0.9628)
            100   0.1779 (0.1800)    24.2470 (24.3022)    0.9970 (0.9968)

Table 1 gives the exact $G$ and approximate $G^*$ (in brackets) numerical results for the normal likelihood and normal prior example given in Section 5.1. We have set $\eta = (\theta_0, \mu_0, \sigma_0^2, \tau_0^2)$ and chosen $\eta_1 = (0.5, 0.25, 0.20, 0.30)$, $\eta_2 = (5.0, 3.5, 2.5, 3.0)$ and $\eta_3 = (25, 20, 18, 15)$; the values of $n$ are as indicated. The confidence level for (ALC) is $\alpha = 0.05$; for (ACC), we set $l = \theta_0/10$. (We omitted results for the effect size problem because the exact and the approximate quantities have the same first-order term and the higher-order terms are hard to get explicitly.)

It is seen that as $n$ increases the values of the (APVC) functional decrease, while the values for (ALC) and (ACC) increase. This is expected from the interpretation of the functionals. For each choice of $\eta$ and criterion, it is seen that the error decreases as $n$ increases; that is, the difference between $G$ and $G^*$ gets smaller as $n$ gets larger. It is important to note that, as the numerical value of $G$ changes, it is closely tracked by our approximation.
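
The Normal-Normal entries can be recomputed from the standard conjugate formulas. The sketch below is one way to do so; it assumes that the components of $\eta$ are $(\theta_0, \mu_0, \sigma_0^2, \tau_0^2)$ and that the (ALC) column reports the expected $\alpha$-quantile $E_{\theta_0} W^{-1}(\alpha \mid X^n)$ with $\alpha = 0.05$, a reading inferred because it reproduces the tabulated values rather than one stated explicitly in the text. The function name and dictionary keys are illustrative.

```python
import math
from statistics import NormalDist


def normal_normal_row(theta0, mu0, sigma0_sq, tau0_sq, n, alpha=0.05):
    """Return (exact, asymptotic) dictionaries for the three criteria."""
    nd = NormalDist()
    length = theta0 / 10.0                      # interval length used for (ACC)
    z = nd.inv_cdf(alpha)

    # Exact conjugate posterior: N(theta_n, post_var).
    post_var = 1.0 / (n / sigma0_sq + 1.0 / tau0_sq)
    post_sd = math.sqrt(post_var)
    mean_of_post_mean = post_var * (n * theta0 / sigma0_sq + mu0 / tau0_sq)
    exact = {
        "APVC": post_var,
        "ALC": mean_of_post_mean + post_sd * z,
        "ACC": 2.0 * nd.cdf(length / (2.0 * post_sd)) - 1.0,
    }

    # Leading-term approximations with I(theta0) = 1 / sigma0_sq.
    asym_sd = math.sqrt(sigma0_sq / n)
    asymptotic = {
        "APVC": sigma0_sq / n,
        "ALC": theta0 + asym_sd * z,
        "ACC": 2.0 * nd.cdf(length / (2.0 * asym_sd)) - 1.0,
    }
    return exact, asymptotic


if __name__ == "__main__":
    for n in (10, 30, 50, 100):
        print(n, normal_normal_row(0.5, 0.25, 0.20, 0.30, n))
```

Running the loop for the first parameter setting gives values that agree with the corresponding rows of Table 1 to the printed precision.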

Less routine examples are the Poisson($\theta$) likelihood with a Gamma($a$, $b$) prior and a Binomial($\theta$) likelihood with a Uniform[0, 1] prior. For the Poisson-Gamma, we set $\eta = (\theta_0, a, b)$ and for the Binomial-Uniform we set $\eta = \theta_0$.

Table 2 shows the values for (APVC) from $G$ and $G^*$ for $\eta_1 = (0.5, 2.5, 3.5)$, $\eta_2 = (1.6, 8, 7.5)$ and $\eta_3 = (1.5, 10, 12)$. For the Binomial-Uniform, we set $\eta_1 = 0.20$, $\eta_2 = 0.5$ and $\eta_3 = 0.75$.

As in Table 1, both the error of approximation and the numerical values decrease as $n$ increases for both prior likelihood pairs. For the Poisson-Gamma case, it is seen that the values for $\eta_2$ and $\eta_3$ are close because their $\theta_0$'s are close. The prior has a smaller effect. For the Binomial-Uniform with constant prior, it is seen that the symmetry of the Binomial makes the values for $\eta_1$ and $\eta_3$ close.

Next, we turn to an example in which a closed form for $G$ does not exist. We will approximate $G$ by $\hat G$ obtained from simulations and compare this to $G^*$, again obtained from our asymptotic expressions. To clarify the comparison in Table 3, observe that, in a world of infinite resources, we would generate $m$ IID $X^n$'s from $p_\theta$, find $W(\cdot \mid X^n = x_j^n)$ for each of the $x_j^n$'s, evaluate $\hat G(\theta, \varepsilon, W, n, m) = (1/m)\sum_{j=1}^{m} F_\varepsilon(W(\cdot \mid X^n = x_j^n))$ and report $\hat G = \hat G(\theta, \varepsilon, W, n, m)$ as an approximation to $G = G(\theta, \varepsilon, W, n)$. Ideally, we would use a large enough $m$ that dependence on it could be neglected and $W$ would be replaced by the hyperparameters, say, $a$, that specify it. That is, we will have

\[ \hat G(\theta, \varepsilon, a, n, m) \approx G(\theta, \varepsilon, a, n), \tag{5.1} \]

so we can obtain minimizing values of $n = n(\theta, \varepsilon, a)$ from $\hat G$. In fact, we want a maximin solution

\[ n_{Mm}(\varepsilon) = \max_{\theta \in K,\ a \in A} n(\theta, \varepsilon, a), \tag{5.2} \]

in which $K$ and $A$ are compact sets. However, direct evaluation of $n_{Mm}(\varepsilon)$ is computationally demanding: It requires, for each specified $\varepsilon$, $\theta$ and $a$, evaluating $E_\theta F_\varepsilon(W(\cdot \mid X^n))$ for many values of $n$ so one can select the smallest $n$ that satisfies the criterion.

Table 2
Exact vs. asymptotic: Non-Normal, (APVC): G (G*)

Parameter   n = 10             n = 30             n = 50             n = 100
Poisson-Gamma
η1          0.0544 (0.0500)    0.0175 (0.0167)    0.0103 (0.0100)    0.0051 (0.0050)
η2          0.0725 (0.1600)    0.0384 (0.0533)    0.0260 (0.0320)    0.0144 (0.0160)
η3          0.0675 (0.1500)    0.0356 (0.0500)    0.0242 (0.0300)    0.0134 (0.0150)
Binomial-Uniform
η1          0.0136 (0.0160)    0.0050 (0.0053)    0.0031 (0.0032)    0.0016 (0.0016)
η2          0.0179 (0.0250)    0.0073 (0.0083)    0.0046 (0.0050)    0.0024 (0.0025)
η3          0.0149 (0.0188)    0.0057 (0.0063)    0.0036 (0.0038)    0.0018 (0.0019)

As in the first two cases, rather than evaluating (5.2), we compute, for some choices of $n$, the empirical posterior functional $\hat G(\theta, \varepsilon, a, n, m)$, which can be regarded as a good enough approximation to $G(\theta, \varepsilon, a, n)$ for large $m$. We also compute our asymptotic approximation, $G^*$. In effect, we have assumed (5.1) by choosing $m$ large enough and then compared $\hat G(\theta, \varepsilon, a, n, m)$ to $G^*(\theta, \varepsilon, a, n)$. Thus, Table 3 gives $G^*$ and $\hat G$ for several choices of $\theta$, $\varepsilon$, $a$ and $n$, for various functionals $F$.

Our argument is that the approximations $G^*$ are close to the corresponding $\hat G$'s for a variety of points $(\theta, \varepsilon, a, n)$ and, therefore, it is reasonable to use sample sizes obtained from $G^*$ as approximations to the sample sizes one would get from optimizing $G$ directly. The values of $\hat G$ and $G^*$ given in the tables support this contention.

Thus, we evaluated a nonconjugate, nonclosed form example. In this case, $G$ could not be found as in Section 5.1; we are forced to use $\hat G$. To provide a real test of the asymptotics, take the likelihood to be Exponential$(x \mid \theta) = \theta e^{-\theta x}$ with a Beta(3/2, 3/2) prior having density $\beta(\theta \mid 3/2, 3/2) \propto \sqrt{\theta(1 - \theta)}$ on [0, 1]. It is seen that this example is far from the normal prior, normal likelihood setting, so its relation to the asymptotic normality used to derive our expressions is not close.

Since $G$ is an expected value of a functional of the posterior, we generate $m = 1000$ IID data sets of size $n$ for several values of $n$, $X_1^n, \ldots, X_m^n$, from an Exponential$(x \mid \theta)$. For each $X_j^n$, $j = 1, \ldots, m$, we draw outcomes from $W(\cdot \mid X_j^n)$ by Markov chain Monte Carlo, compute $F(W(\cdot \mid X_j^n))$ from the empirical posterior distribution, and approximate $E_\theta F(W(\cdot \mid X^n))$ by $(1/m)\sum_{j=1}^{m} F(W(\cdot \mid X_j^n))$.
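
A simulation of this kind can be sketched for the posterior variance functional as follows. This is not the authors' procedure: instead of Markov chain Monte Carlo, the one-dimensional posterior is integrated on a midpoint grid, and the default number of replications is smaller than the $m = 1000$ used in the text. All function names, the grid size and the seed are illustrative assumptions.

```python
import math
import random


def posterior_variance(data, grid_size=2000):
    """Posterior variance of theta on (0, 1) by midpoint-grid integration.

    Likelihood: Exponential with rate theta; prior density proportional to
    sqrt(theta * (1 - theta)), i.e. Beta(3/2, 3/2).
    """
    n, s = len(data), sum(data)
    thetas = [(i + 0.5) / grid_size for i in range(grid_size)]
    # Log posterior up to an additive constant.
    logp = [n * math.log(t) - t * s + 0.5 * math.log(t * (1.0 - t)) for t in thetas]
    top = max(logp)                      # subtract the maximum to avoid underflow
    w = [math.exp(lp - top) for lp in logp]
    total = sum(w)
    mean = sum(t * wi for t, wi in zip(thetas, w)) / total
    second = sum(t * t * wi for t, wi in zip(thetas, w)) / total
    return second - mean * mean


def expected_posterior_variance(theta0, n, m=200, seed=0):
    """Monte Carlo average of the posterior variance over m simulated data sets."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(m):
        data = [rng.expovariate(theta0) for _ in range(n)]
        total += posterior_variance(data)
    return total / m


def asymptotic_variance(theta0, n):
    """Leading term 1 / (n * I(theta0)) with I(theta) = 1 / theta^2."""
    return theta0 ** 2 / n


if __name__ == "__main__":
    for n in (10, 30, 50, 100):
        print(n, expected_posterior_variance(0.5, n), asymptotic_variance(0.5, n))
```

Grid integration is practical here only because the parameter is scalar and confined to [0, 1]; in higher dimensions a sampler such as the one described in the text would be the natural choice.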

For several values of $\theta_0$ taken as true, potential sample sizes $n$, and each of three criteria, we give the empirical value, $\hat G$, and its asymptotic approximation using our technique, $G^*$, in brackets in Table 3. The expected HPD is a proxy for (ACC): In the average coverage criterion, we fix $l$ and find the $n$ making the coverage probability of the HPD set of length less than $l$ at least $1 - \alpha$. Here, the $E$(HPD) represents the $l$ for coverage 0.95 for the approximate HPD interval centered at the posterior mean.

Table 3
Empirical vs. asymptotic: Non-Normal

θ0     n     E(Var(θ|X^n))      E(HPD), empirical    E(HPD), asymptotic    E(ALC)
0.25   10    0.0116 (0.0062)    [0.1475, 0.5388]     [0.1633, 0.4732]      0.3912 (0.3099)
       30    0.0031 (0.0021)    [0.1742, 0.3826]     [0.1803, 0.3592]      0.2084 (0.1789)
       50    0.0018 (0.0012)    [0.1884, 0.3483]     [0.1939, 0.3325]      0.1599 (0.1386)
       100   0.0008 (0.0006)    [0.2017, 0.3123]     [0.2055, 0.3035]      0.1106 (0.0980)
0.50   10    0.0238 (0.0250)    [0.2703, 0.8399]     [0.2320, 0.8518]      0.5696 (0.6198)
       30    0.0107 (0.0083)    [0.3387, 0.7273]     [0.3409, 0.6988]      0.3886 (0.3578)
       50    0.0068 (0.0050)    [0.3727, 0.6832]     [0.3798, 0.6570]      0.3105 (0.2772)
       100   0.0034 (0.0025)    [0.4020, 0.6208]     [0.4084, 0.6044]      0.2188 (0.1960)
0.75   10    0.0348 (0.0562)    [0.3738, 0.9467]     [0.2135, 1.1432]      0.5729 (0.9297)
       30    0.0140 (0.0187)    [0.4986, 0.9368]     [0.4556, 0.9923]      0.4382 (0.5368)
       50    0.0102 (0.0112)    [0.5511, 0.9282]     [0.5349, 0.9506]      0.3771 (0.4158)
       100   0.0059 (0.0056)    [0.5988, 0.8988]     [0.5997, 0.8937]      0.3000 (0.2940)

It is seen that the expected (APVC) and (ALC) decrease as $n$ increases, as does the error of approximation. Likewise, the expected HPD length decreases, as does the error of approximation, as $n$ increases. When $n = 10$, the approximation can be poor, with errors often over 25% of the true value; this may be due to $m$ or $n$ being too small or to convergence problems in the Markov chain Monte Carlo. At the other end, $n = 100$ gives good approximation in absolute and relative senses, suggesting the size of $m$ is not the problem. Overall, in highly nonnormal and nonconjugate settings, our approximation may not give satisfactory results unless $n$ is moderate, say, over 30.

We comment that the effect size criterion involves the mean posterior quantiles, so we expect our formulae to give results similar to those for $E_{\theta_0}$(HPD), for which reason we omitted its presentation here.

6. Final remarks. Overall, we have argued that simple, asymptotically 
valid inequalities can be derived so that Bayesian sample sizes can be readily 
determined essentially as easily as in the frequentist case. We have done this 
for four sample size criteria taken from the established literature. 

Apart from this contribution, we have several observations. 

First, integrating our approximations for (1.1) over $\Omega$ gives expressions for use in pre-posterior Bayesian calculations where the expectations are taken with respect to the mixture density. That is, because $F(W(\cdot \mid X^n))$ does not depend on the parameter explicitly, the expectation with respect to the mixture is $E_m F(W(\cdot \mid X^n)) = \int_\Omega E_\theta F(W(\cdot \mid X^n))\, w(\theta)\, d\theta$, and our asymptotic expressions will apply to the argument of the integral. Our results are slightly stronger than necessary for evaluating marginal probabilities.
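
As an illustration of the pre-posterior calculation, the sketch below integrates the leading (APVC) term $1/(n I(\theta))$ against a prior density by the midpoint rule. The function name, the integration scheme and the Bernoulli example are assumptions made for this sketch only.

```python
def preposterior_apvc(n, fisher, prior_density, lo, hi, grid_size=10000):
    """Midpoint-rule integral of prior_density(theta) / (n * fisher(theta)) over [lo, hi]."""
    h = (hi - lo) / grid_size
    total = 0.0
    for i in range(grid_size):
        t = lo + (i + 0.5) * h          # midpoints avoid the endpoints lo and hi
        total += prior_density(t) / (n * fisher(t)) * h
    return total


if __name__ == "__main__":
    # Bernoulli likelihood, I(theta) = 1 / (theta * (1 - theta)), Uniform(0, 1) prior.
    value = preposterior_apvc(
        30,
        fisher=lambda t: 1.0 / (t * (1.0 - t)),
        prior_density=lambda t: 1.0,
        lo=0.0,
        hi=1.0,
    )
    print(value)
```

For the Bernoulli model with a uniform prior the integrand is $\theta(1 - \theta)/n$, so the integral equals $1/(6n)$, which gives a simple analytic check on the numerical result.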

Second, although we have not done it here, we suggest that, as ever, 
sensitivity analyses should be used to ensure the sample sizes obtained from 
any one method are robust against deviations of the prior, likelihood and loss 
function (if one exists) from the nominal choices used to obtain the sample 
sizes. Robustness against similar choices of sample size criterion would also 
be desirable. 

Finally, we anticipate that examining functionals of posteriors may be a step toward unifying the three cases described in the Introduction. Decision theoretic procedures implicitly rest on the posterior risk, which can be regarded as a functional of the posterior. Evidentiary procedures usually devolve to posterior probabilities, which can likewise be regarded as functionals of the posterior; we suggest formulae for these at the end of the Appendix. And, finally, purely Bayes criteria that focus on credibility sets also express properties of credibility sets in terms of the posterior. It may be that a suitably general treatment of functionals of the posterior will include all these as special cases of one unified formalism.

APPENDIX 

Here, we prove Theorem 3.3 and compare it with the expansions for two 
functionals in [5]. As a final point, we note how to use our techniques to 
get an asymptotic expansion for a functional that is the expectation of a 
posterior mean of a function of the parameter. 

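Proof of Theorem 3.3. We need to approximate $E_{\theta_0}(I_{S_n} W^{(r)}(\theta_0 \mid X^n))$; for simplicity of notation, we omit the $I_{S_n}$.

First, for $1 \le j \le J$, the $\gamma_j(\sqrt{n}\, I^{1/2}(\theta_0)(\theta - \hat\theta_n), X^n)$'s are polynomials and, hence, differentiable. As in Assumption JE, the remainder term is

\[ n^{-(J+1)/2} \gamma_{J+1}(\cdot, X^n) = W(\theta \mid X^n) - \Phi_{\hat\theta_n, I^{-1}(\theta_0)/n}(\theta) - \sum_{j=1}^{J} n^{-j/2} \phi(\cdot)\, \gamma_j(\cdot, X^n), \qquad n > N,\ X^n \in S_n. \]

So $\gamma_{J+1}(\cdot, X^n)$ has $r$th derivative whenever $W(\cdot \mid X^n)$ does.

To control the expectation of $W^{(r)}(\theta \mid X^n)$, we replace the $\gamma_j(\cdot, X^n)$'s by the $\gamma_j(\cdot)$'s. That is, by the boundedness of the $\gamma_j(\cdot, X^n)$'s, and the a.s. convergence of $\hat\theta_n$ and the $I^{r}(\hat\theta_n)$'s to $\theta_0$ and the $I^{r}(\theta_0)$'s, we have

\[ \gamma_j(\cdot, X^n) = \gamma_j(\cdot)(1 + o_P(1)), \]

for $j = 1, \ldots, J+1$, where the $o_P(1)$ may depend on $j$, but is independent of $\theta$. So we have

\[ W(\theta_0 \mid X^n) = \Phi_{\hat\theta_n, I^{-1}(\theta_0)/n}(\theta_0) + \sum_{j=1}^{J} n^{-j/2} \phi(\sqrt{n}\, I^{1/2}(\theta_0)(\theta_0 - \hat\theta_n))\, \gamma_j(\sqrt{n}\, I^{1/2}(\theta_0)(\theta_0 - \hat\theta_n))(1 + o_P(1)) + n^{-(J+1)/2} \gamma_{J+1}(\sqrt{n}\, I^{1/2}(\theta_0)(\theta_0 - \hat\theta_n))(1 + o_P(1)). \tag{A.1} \]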

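Next, we convert (A.1) into a form to which Assumption EE can be applied. We begin to deal with derivatives of the first term by noting

\[ \frac{\partial^{r}}{\partial\theta^{r}} \Phi_{\hat\theta_n, I^{-1}(\theta_0)/n}(\theta)\Big|_{\theta = \theta_0} = \frac{\partial^{r-1}}{\partial\theta^{r-1}} \phi_{\hat\theta_n, I^{-1}(\theta_0)/n}(\theta)\Big|_{\theta = \theta_0}. \]

Next, let $i_i^{1/2}(\theta_0)$ be the $i$th column of $I^{1/2}(\theta_0)$, and $1_i = (0, \ldots, 0, 1, 0, \ldots, 0)$ be the $d$-vector with the $i$th component 1 and all other components zero. For the first derivative with respect to the $i$th component of $\theta$ we have, writing $u = \sqrt{n}\, I^{1/2}(\theta_0)(\theta - \hat\theta_n)$,

\[ \frac{\partial \phi_{\hat\theta_n, I^{-1}(\theta_0)/n}(\theta)}{\partial\theta_i} = -|n I(\theta_0)|^{1/2}\, \sqrt{n}\, (i_i^{1/2}(\theta_0))' u\ \phi(u). \]

So, by an induction argument we have

\[ \frac{\partial^{r-1}}{\partial\theta^{r-1}} \phi_{\hat\theta_n, I^{-1}(\theta_0)/n}(\theta)\Big|_{\theta = \theta_0} = n^{|r|/2} |I(\theta_0)|^{1/2} H_{r-1}(\sqrt{n}\, I^{1/2}(\theta_0)(\theta_0 - \hat\theta_n))\, \phi(\sqrt{n}\, I^{1/2}(\theta_0)(\theta_0 - \hat\theta_n)), \tag{A.2} \]

in which we have simplified by using $(d + |r - 1|)/2 = |r|/2$.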

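Using (A.2) in the first term, and recalling the definition of the $\eta_j^{(r)}$ in the second term, the $r$th derivative of (A.1) becomes, with $u_n = \sqrt{n}\, I^{1/2}(\theta_0)(\theta_0 - \hat\theta_n)$,

\[ W^{(r)}(\theta_0 \mid X^n) = n^{|r|/2} |I(\theta_0)|^{1/2} H_{r-1}(u_n)\, \phi(u_n) \tag{A.3} \]
\[ \qquad + \sum_{j=1}^{J} n^{(|r| - j)/2}\, \eta_j^{(r)}(\sqrt{n}(\theta_0 - \hat\theta_n))\, (1 + o_P(1))\, \phi(u_n) \tag{A.4} \]
\[ \qquad + n^{-(J+1)/2}\, n^{|r|/2}\, \gamma_{J+1}^{(r)}(u_n)(1 + o_P(1)). \tag{A.5} \]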

Here, $\gamma_{J+1}^{(r)}(\sqrt{n}\, I^{1/2}(\theta_0)(\theta_0 - \hat\theta_n))$ is generated by applying the chain rule to the last term on the right-hand side in (A.1). Note that we differentiate with respect to $\theta$ and then evaluate at $\theta_0$. Expressions (A.3) and (A.4) will give the two leading terms in (3.8), respectively.

Next we use Assumption EE to observe an identity: We can take expectations over $\hat\theta_n$ when it occurs in the argument of a polynomial $Q(\cdot)$ by the relationship



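\begin{align}
& E_{\theta_0}\, Q(\sqrt{n}\, I^{1/2}(\theta_0)(\theta_0 - \hat\theta_n))\, \phi(\sqrt{n}\, I^{1/2}(\theta_0)(\theta_0 - \hat\theta_n)) \notag \\
& \qquad = \int Q(v)\phi(v) \Bigl( \phi(v) + \sum_{k=1}^{K} n^{-k/2} P_k(v)\phi(v) \Bigr)\, dv + o(n^{-K/2}) \notag \\
& \qquad = \frac{1}{(4\pi)^{d/2}} \Bigl[ \bar{Q}\Bigl(\frac{Z}{\sqrt{2}}\Bigr) + \sum_{k=1}^{K} n^{-k/2}\, \overline{Q \circ P_k}\Bigl(\frac{Z}{\sqrt{2}}\Bigr) \Bigr] + o(n^{-K/2}), \tag{A.6}
\end{align}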



where $Q \circ P_k(\cdot)$ is the polynomial obtained by their product in which, as before, we have taken expectations and replaced powers. [The factor $1/(4\pi)^{d/2}$ appears when we multiply two standard normal densities and observe the change of variables in the exponent.]
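
The origin of the $1/(4\pi)^{d/2}$ factor can be checked numerically. The sketch below, restricted to $d = 1$, verifies that $\int Q(v)\phi(v)^2\,dv = (4\pi)^{-1/2} E\,Q(Z/\sqrt{2})$ for a few polynomials $Q$, using a plain trapezoid rule. The choice of test polynomials, the integration range and the function names are assumptions made for illustration.

```python
import math


def phi(v):
    """Standard normal density."""
    return math.exp(-0.5 * v * v) / math.sqrt(2.0 * math.pi)


def integrate(f, lo=-12.0, hi=12.0, steps=20000):
    """Composite trapezoid rule on [lo, hi]; ample for Gaussian-weighted integrands."""
    h = (hi - lo) / steps
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, steps):
        total += f(lo + i * h)
    return total * h


def density_weighted_integral(q):
    """Left side: integral of Q(v) * phi(v)^2 over the real line."""
    return integrate(lambda v: q(v) * phi(v) ** 2)


def rescaled_expectation(q):
    """Right side: (4 pi)^{-1/2} * E[Q(Z / sqrt(2))] with Z ~ N(0, 1)."""
    return integrate(lambda z: q(z / math.sqrt(2.0)) * phi(z)) / math.sqrt(4.0 * math.pi)


# Illustrative polynomials: a constant, two even powers and one odd polynomial.
test_polynomials = [
    lambda v: 1.0,
    lambda v: v * v,
    lambda v: v ** 4,
    lambda v: v ** 3 - 3.0 * v,
]
pairs = [(density_weighted_integral(q), rescaled_expectation(q)) for q in test_polynomials]
```

The change of variables $v = z/\sqrt{2}$ turns $\phi(v)^2$ into $\phi(z)/\sqrt{2\pi}$ and contributes a Jacobian $1/\sqrt{2}$, which together give $(4\pi)^{-1/2}$. For odd $Q$ both sides vanish, which is the mechanism behind zero coefficients such as $A_1 = 0$ in the scalar case.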

We use (A.6) in (A.3), (A.4) and (A.5) to get (3.8).

Since the integrability of $W^{(r)}(\cdot \mid X^n)$ and $H_{r-1}(\cdot)\phi(\cdot)$ implies that of $\gamma_{J+1}^{(r)}(\cdot)$, we can apply (A.6) to see that the expectation of the error term (A.5) is

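\begin{align}
& E_{\theta_0}\bigl( n^{-(J+1)/2}\, n^{|r|/2}\, \gamma_{J+1}^{(r)}(\sqrt{n}\, I^{1/2}(\theta_0)(\theta_0 - \hat\theta_n))(1 + o_P(1)) \bigr) \notag \\
& \qquad = n^{(|r|-J-1)/2} \int \gamma_{J+1}^{(r)}(v) \Bigl( \phi(v) + \sum_{k=1}^{K} n^{-k/2} P_k(v)\phi(v) \Bigr)\, dv\, (1 + o(1)) \notag \\
& \qquad = O(n^{(|r|-J-1)/2}). \tag{A.7}
\end{align}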

In (A.7) we used the fact that the integral over $\gamma_{J+1}^{(r)}(v)\phi(v)$ gives an $O(1)$ term. The integral over the summands in the summation gives terms of order $O(1)n^{-k/2}$, for $k = 1, \ldots, K$. So, the initial term gives the order in $n$ as indicated in (A.7).

Similarly, using (A.6), the expectation of (A.4) is 



J 

E 



^(IH-i)/2 



+ o(n(IH-^-i)/2) 




(A.8) 



J 

^„(kl-i)/2 



The leading term in (A.8) gives the second term in $A_1$ in (3.8).
Finally, using (A.6), the expectation of (A.3) is

(A.9) 

' ' +o(n(H-^)/2), 



which gives the leading term in (3.8) and the first term in $A_1$. That is, by collecting terms in (A.7)-(A.9), the proof is completed. □

To exemplify Theorem 3.3, we examine the average behavior of the posterior density at $\theta_0$. Straightforward extensions give similar results at other values of $\theta$.

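Consider the functional $F(W(\cdot \mid X^n)) = w(\theta_0 \mid X^n) = \partial^{|r|} W(\theta \mid X^n)/\partial\theta^{r}\,\big|_{\theta=\theta_0}$ with $r = (1, \ldots, 1)$. Since $H_{r-1}(\cdot) = H_0(\cdot) = 1$, Theorem 3.3 gives

(A.10)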

When $d = 1$, we can verify that $A_1 = 0$. This is easy because the expressions for the $\gamma_j(\cdot)$'s are available from [11] in this case. Indeed, we have

r]i{v) = I{9o)c^q{ciov^ + cqiv) 

and 

in which Pi{v) = xai'/S!. The expectations of Pi{v) and r]i{v) when v is 
Normal(0, 1) are obviously zero. So, $\bar{P}_1(Z/\sqrt{2}) = \bar{\eta}_1(Z/\sqrt{2}) = 0$ and, thus, $A_1 = 0$.

This means that the two biggest terms in (A.10) are of order $n^{d/2}$ and $n^{(d-2)/2}$. We have not carried out the analysis far enough to identify the coefficient of the second-order term.

It is seen that (A.10) is the same as the result in [5]. We remark that if one chooses $F(W(\cdot \mid X^n)) = w^2(\theta_0 \mid X^n)$, the techniques above give

$$E_{\theta_0}(w^2(\theta_0 \mid X^n)) \approx E_{\theta_0}(n^{d}\,|I(\theta_0)|\,\phi^2(Z)) = \frac{n^{d}\,|I(\theta_0)|}{3^{d/2}(2\pi)^{d}}, \tag{A.11}$$

the same as in [5].
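
The constant in (A.11) can be confirmed directly. For $Z \sim N(0, I_d)$, $E\,\phi^2(Z) = \int \phi(z)^3\,dz = 3^{-d/2}(2\pi)^{-d}$, and the $d$-dimensional value is the $d$th power of the scalar one because the standard normal density factorizes. The Python sketch below checks the scalar constant by quadrature and wraps the leading-order value in a small helper. The helper names and the example inputs are hypothetical.

```python
import math


def phi(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def integrate(f, lo=-12.0, hi=12.0, steps=20000):
    """Composite trapezoid rule on [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, steps):
        total += f(lo + i * h)
    return total * h


def second_moment_constant(d):
    """E[phi_d(Z)^2] for Z ~ N(0, I_d): 1 / (3^{d/2} (2 pi)^d), as in (A.11)."""
    return 1.0 / (3.0 ** (d / 2) * (2.0 * math.pi) ** d)


def approx_mean_sq_density(n, det_info, d):
    """Leading-order value n^d |I(theta_0)| / (3^{d/2} (2 pi)^d) from (A.11)."""
    return n ** d * det_info * second_moment_constant(d)


# Scalar check: E[phi(Z)^2] is the integral of phi(z)^3.
numeric_constant = integrate(lambda z: phi(z) ** 3)
closed_constant = second_moment_constant(1)
```

Here `det_info` stands for the determinant $|I(\theta_0)|$, which the caller must supply. The helper returns only the leading term and says nothing about the size of the remainder.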

For completeness, we next show how to use the general procedure Proposition 2.1 to get (A.10). There are four types of terms in (2.10); we go through them in turn.

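The first term on the right-hand side of (2.10) is

$$E\,F(\Phi(Z + \sqrt{n}\, I^{1/2}(\theta_0)(\cdot - \theta_0))) = \frac{n^{d/2}\,|I^{1/2}(\theta_0)|}{(2\pi)^{d/2}} \int e^{-(1/2)z'z}\, \frac{e^{-(1/2)z'z}}{(2\pi)^{d/2}}\, dz = \frac{n^{d/2}\,|I^{1/2}(\theta_0)|}{(4\pi)^{d/2}}.$$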
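
To see how the leading term $n^{d/2}|I^{1/2}(\theta_0)|/(4\pi)^{d/2}$ behaves in a concrete case, the following Python sketch uses a conjugate normal model, which is an assumption introduced here and not one of the paper's examples: $X_i \sim N(\theta, \sigma^2)$ with prior $\theta \sim N(\mu_0, \tau^2)$, so $d = 1$ and $I(\theta_0) = 1/\sigma^2$. In this model the posterior is normal and $E_{\theta_0} w(\theta_0 \mid X^n)$ is available exactly as a normal density, which allows a direct comparison with $\sqrt{n/(4\pi\sigma^2)}$. All names and default parameter values are illustrative.

```python
import math


def normal_pdf(x, mean, var):
    """Density of N(mean, var) at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def exact_expected_density(n, theta0, sigma=1.0, mu0=0.0, tau=1.0):
    """Exact E_{theta0}[w(theta0 | X^n)] in the assumed conjugate normal model.

    The posterior is N(m_n, s2) with s2 = 1 / (n / sigma^2 + 1 / tau^2) and
    m_n = s2 * (n * xbar / sigma^2 + mu0 / tau^2), where xbar ~ N(theta0, sigma^2 / n).
    Averaging N(theta0; m_n, s2) over the normal law of m_n gives another normal density.
    """
    post_var = 1.0 / (n / sigma ** 2 + 1.0 / tau ** 2)
    mean_of_post_mean = post_var * (n * theta0 / sigma ** 2 + mu0 / tau ** 2)
    var_of_post_mean = post_var ** 2 * n / sigma ** 2
    return normal_pdf(theta0, mean_of_post_mean, post_var + var_of_post_mean)


def leading_term(n, sigma=1.0):
    """n^{1/2} I(theta0)^{1/2} / (4 pi)^{1/2} with I(theta0) = 1 / sigma^2."""
    return math.sqrt(n / (4.0 * math.pi * sigma ** 2))


def ratio(n, theta0=0.5):
    """Exact expected posterior density at theta0 divided by the leading term."""
    return exact_expected_density(n, theta0) / leading_term(n)


ratios = {n: ratio(n) for n in (10, 100, 1000, 10000)}
```

The ratios approach 1 as $n$ grows, in line with the claim that the first term of (2.10) carries the leading order. The speed of that approach depends on the prior parameters chosen here and should not be read as a general rate.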

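Next, for $J > 1$, the terms in the summation in (2.10) are of the form

$$\begin{aligned}
n^{-j/2}\, \frac{n^{d/2}\,|I^{1/2}(\theta_0)|}{(2\pi)^{d/2}} \int e^{-(1/2)z'z}\, P_j(z)\, \frac{e^{-(1/2)z'z}}{(2\pi)^{d/2}}\, dz
&= n^{-j/2}\, \frac{n^{d/2}\,|I^{1/2}(\theta_0)|}{(4\pi)^{d/2}} \int \frac{e^{-(1/2)z'z}}{(2\pi)^{d/2}}\, P_j\Bigl(\frac{z}{\sqrt{2}}\Bigr)\, dz \\
&= n^{-j/2}\, \frac{n^{d/2}\,|I^{1/2}(\theta_0)|}{(4\pi)^{d/2}}\, \bar{P}_j\Bigl(\frac{Z}{\sqrt{2}}\Bigr),
\end{aligned}$$

where $\bar{P}_j(Z/\sqrt{2})$ is the expectation of $P_j(Z/\sqrt{2})$ with the $z^k$'s replaced by $\alpha_k$'s, the $k$th moments of $N(0, I_d)$.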
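Next, for $I_1(n)$, we observe that

$$\begin{aligned}
\Phi(z + \sqrt{n}\, I^{1/2}(\theta_0)(\theta - \theta_0))
&= \frac{1}{(2\pi)^{d/2}} \int_{-\infty}^{z + \sqrt{n}\, I^{1/2}(\theta_0)(\theta - \theta_0)} e^{-(1/2)t't}\, dt \\
&= \frac{n^{d/2}\,|I^{1/2}(\theta_0)|}{(2\pi)^{d/2}} \int_{-\infty}^{\theta} e^{-(1/2)(\sqrt{n}\, I^{1/2}(\theta_0)(v - \theta_0) + z)'(\sqrt{n}\, I^{1/2}(\theta_0)(v - \theta_0) + z)}\, dv.
\end{aligned}$$

This gives

$$\frac{\partial^{|r|}\, \Phi(z + \sqrt{n}\, I^{1/2}(\theta_0)(\theta - \theta_0))}{\partial\theta^{r}}\Big|_{\theta=\theta_0} = n^{d/2}\,|I^{1/2}(\theta_0)|\,\phi(z),$$

and we have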



F{^{z + ^nI^eo){--e,))) 
l + \\z\ 



h{n) = / ,,^7^^ "^'^^ dz 



■ dz 



+ II-^IK 

^d/2|jl/2(^^)| .g-{l/2y. 



(27r)'^/2 J l + \\z 



\J 



dz. 



which is smaller than the leading term when multiplied by $o(n^{-J/2})$ for any $J > 1$.

It remains to evaluate the expectation of the remainder term. As assumed 
in the proofs of the theorems, we only need to evaluate it over the "good" 
sets, and we omit the indicators for them. Write 

\j=i 

+ n-^'+'y^jj^i{V^ry\eo){e - + o(i)) 

/ 0=00 

J 

= n-^/\^/^<t>{V^l'/\9o){9 - k)WP{^/^I^^'\Oom - + o{l)) 

i=i 

So, by Assumption EE and (A.6), we get

Ee,Rn = jz^^^^vf\cy/V2){l + o{l))+o{n^''-'y% 

which has lower order than the leading term for $J > 1$. Thus, by Proposition 2.1, we get the same result as from Theorem 3.3.

Our final point is that our techniques can be used to approximate the 
expected value of posterior expectations. Indeed, from (A.l), note that 

+ n-^i^n^i'\i^'\eomV^i''\e,){e - k)) 

(A.12) 

x^f\V^i^'\e,){e-km + o{i)) 

The $\gamma_j(\cdot, X^n)$'s are from Assumption JE and are differentiable, as are the $\eta_j^{(r)}(\cdot)$'s. Now, suppose $h = h(\theta)$ has all $r$th partial derivatives, for $|r| < J$, on a neighborhood of $\theta_0$ and that $h(\theta)w(\theta \mid X^n)$ and its partial derivatives are integrable with respect to $w(\cdot \mid X^n)$.

Then, Taylor expanding $h$ at $\theta$, justifying a use of Assumption EE and gathering terms suggests that

EeJ [ h{9)w{9\X^^)d9) =h{9o) + n~y^ry\9o) ^ /iM(0o)^r 
(A.13) 

+ Ain ^ + o{n ^)+Rn, 

where 

(A.14) A, = r'/\9o) ^ 4\^)h^''HOo) + p-H9o) h^'HOo) 

\r\=l \r\=2 

and the $\eta_j^{(r)}(\cdot)$'s are as in Theorem 3.3. An extension of this argument gives similar expressions for higher-order terms.
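
The shape of (A.13), a limit $h(\theta_0)$ plus corrections in powers of $n^{-1/2}$, can be illustrated in a model where expected posterior expectations are available in closed form. The Python sketch below again assumes a conjugate normal model ($X_i \sim N(\theta, \sigma^2)$, prior $\theta \sim N(\mu_0, \tau^2)$), which is an illustrative assumption and not taken from the paper. For $h(\theta) = \theta$ and $h(\theta) = \theta^2$ it computes $n\,(E_{\theta_0}\int h(\theta)w(\theta \mid X^n)\,d\theta - h(\theta_0))$, which should settle to a constant if the $n^{-1/2}$ term vanishes and the $n^{-1}$ term is the first nonzero correction. The limiting constants quoted in the comments are derived for this model only and are not the coefficients of (A.14).

```python
def posterior_summaries(n, theta0, sigma=1.0, mu0=0.0, tau=1.0):
    """Posterior variance, and mean and variance of the posterior mean under theta0."""
    post_var = 1.0 / (n / sigma ** 2 + 1.0 / tau ** 2)
    mean_of_post_mean = post_var * (n * theta0 / sigma ** 2 + mu0 / tau ** 2)
    var_of_post_mean = post_var ** 2 * n / sigma ** 2
    return post_var, mean_of_post_mean, var_of_post_mean


def expected_posterior_mean(n, theta0, **kw):
    """E_{theta0} of the posterior mean, i.e. h(theta) = theta."""
    _, m, _ = posterior_summaries(n, theta0, **kw)
    return m


def expected_posterior_second_moment(n, theta0, **kw):
    """E_{theta0} of the posterior second moment, i.e. h(theta) = theta^2."""
    s2, m, v = posterior_summaries(n, theta0, **kw)
    return m ** 2 + v + s2


def scaled_bias_mean(n, theta0, **kw):
    """n * (E[posterior mean] - theta0); tends to (mu0 - theta0) * sigma^2 / tau^2."""
    return n * (expected_posterior_mean(n, theta0, **kw) - theta0)


def scaled_bias_second_moment(n, theta0, **kw):
    """n * (E[posterior second moment] - theta0^2).

    In this model the limit is 2 * theta0 * (mu0 - theta0) * sigma^2 / tau^2 + 2 * sigma^2.
    """
    return n * (expected_posterior_second_moment(n, theta0, **kw) - theta0 ** 2)


# Illustrative run with theta0 = 0.5 and the default sigma = tau = 1, mu0 = 0.
table = [(n, scaled_bias_mean(n, 0.5), scaled_bias_second_moment(n, 0.5))
         for n in (10, 1000, 1000000)]
```

With these defaults the two scaled biases stabilize near $-0.5$ and $1.5$ as $n$ increases, so in this example the error of the expected posterior expectation is of order $n^{-1}$.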